Spider neurotoxins and method of producing the same



The present invention relates to a novel toxin, and
a method of producing a toxin, particularly but not
exclusively to an insect specific neurotoxin
δ-Latroinsectotoxin (δ-LIT), and a method of producing
same.



A family of high molecular weight neurotoxins has
been found in the venom of the black widow spider
(Latrodectus mactans Tredecimguttatus). Some of these
toxins have been identified as being either vertebrate or
invertebrate specific. α-Latrotoxin (α-LT) and
α-Latroinsectotoxins (α-LIT) are two such neurotoxins
that have been characterised as being vertebrate and
invertebrate specific respectively. The primary
structures of these proteins have been determined, but
characterisation of the structural features of the cloned
toxins has not been possible due to the inability to
achieve functional expression of their genes.

It is an object of the present invention to provide
a novel toxin and a method of producing a toxin usually
naturally produced by post-translational modification of
a precursor protein, using recombinant technology.

According to the present invention there is
provided a polypeptide, such as a toxin, formed by
expression of a truncated form of a gene sequence, or an
analogue thereof.

Preferably the polypeptide is a neurotoxin and 
preferably corresponds to a toxic derivative of a 
substantially non-toxic precursor polypeptide encoded by 
the gene sequence. The polypeptide may comprise an amino 
acid sequence that corresponds to a truncated form of the 
amino acid sequence of a substantially non-toxic 
precursor polypeptide. Preferably the amino acid 
sequence of the polypeptide corresponds to the amino acid 
sequence of the precursor polypeptide with truncation 
thereof principally at the carboxy (C) end, and desirably 
by about 150 to 200 amino acids. The polypeptide amino 
acid sequence may in addition correspond to the precursor 
polypeptide amino acid sequence truncated at the amino 
end (N) preferably by less than 50 amino acids, and 
desirably by 7 or 28 amino acids. 

The amino acid sequence of the polypeptide may be
homologous to the amino acid sequence of the insect
specific neurotoxin δ-Latroinsectotoxin (δ-LIT) or an
active derivative thereof, and preferably comprises an
amino acid sequence as shown in SEQ ID NO. 1 and SEQ ID
NO. 2 or an active derivative thereof. Preferably the
toxin is expressed from a nucleotide construct or
truncated form of the gene sequence comprising a sequence
as shown in SEQ ID NO. 1, or active variants thereof.
Preferably the toxin is expressed from a sequence
substantially as provided in a microorganism deposited at
The National Collections of Industrial and Marine
Bacteria Limited, under Accession No. NCIMB 40632.

The invention also provides a protein for use as a
toxin comprising an amino acid sequence substantially as
shown in SEQ ID NO. 1 and SEQ ID NO. 2, or an active
derivative thereof.

According to a further aspect of the present 
invention there is provided a nucleotide sequence 
comprising a truncated form of a gene sequence or an 
analogue thereof, for use in the expression of a 
polypeptide, such as a toxin. 

Preferably the nucleotide sequence corresponds to a 
gene encoding for a precursor polypeptide and truncated 
at the 3' end thereof or an active derivative thereof. 
Preferably the nucleotide sequence corresponds to the 
gene truncated by about 400 to 650 nucleotide bases, and
desirably by between 550 and 600 nucleotide bases.

The nucleotide sequence may also correspond to the
gene truncated at the 5' end thereof, preferably by less
than 100 nucleotide bases, and desirably by either 84 or
21 nucleotide bases.

Preferably the nucleotide sequence corresponds to
part of a gene encoding for a neurotoxin in the venom of
the Black Widow Spider (Latrodectus mactans
Tredecimguttatus), or an active derivative thereof.

The nucleotide sequence may correspond to part of 
the gene encoding the precursor polypeptide of insect 
specific toxin δ-Latroinsectotoxin (δ-LIT), or an
active derivative thereof. The nucleotide sequence 
preferably codes for a polypeptide comprising a sequence 
of 991 amino acids. 
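
By way of illustration only, the truncation figures given above can be cross-checked against the residue numbering quoted elsewhere in this specification (a 1214-residue deduced precursor whose mature numbering starts at Asp residue 29). The following is a minimal arithmetic sketch, not part of the described method; the variable and function names are hypothetical.

```python
# Figures quoted in this specification (no new data are introduced).
PRECURSOR_LEN = 1214      # residues in the deduced precursor polypeptide
MATURE_START = 29         # precursor residue (Asp) numbered +1 in the mature protein
TOXIN_END_MATURE = 991    # last residue of the truncated toxin, mature numbering


def codons_to_bases(n_residues: int) -> int:
    """Number of nucleotide bases that encode n_residues amino acids."""
    return 3 * n_residues


# N-terminal truncation: residues preceding the mature N-terminus.
n_terminal_cut = MATURE_START - 1
# Length of the polypeptide starting at Asp (+1).
mature_len = PRECURSOR_LEN - n_terminal_cut
# C-terminal truncation needed to stop after residue 991.
c_terminal_cut = mature_len - TOXIN_END_MATURE

print("N-terminal cut:", n_terminal_cut, "aa =", codons_to_bases(n_terminal_cut), "bases")
print("C-terminal cut:", c_terminal_cut, "aa =", codons_to_bases(c_terminal_cut), "bases")
print("Coding frame of precursor:", codons_to_bases(PRECURSOR_LEN), "bases")
```

On these figures the C-terminal truncation falls within the stated ranges of 150 to 200 amino acids and 550 to 600 nucleotide bases, and the 5' truncations of 84 or 21 bases correspond to 28 or 7 codons.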

Preferably the nucleotide sequence comprises a base
sequence as shown in SEQ ID NO. 1, or an active derivative
thereof, and preferably as comprised in a microorganism
deposited under Accession No. NCIMB 40632 at The
National Collections of Industrial and Marine Bacteria
Limited.

Preferably the nucleotide sequence codes for a
polypeptide having an amino acid sequence as shown in
SEQ ID NO. 1 and SEQ ID NO. 2, or an active derivative
thereof.

The nucleotide sequence may be a cDNA derived from 
mRNA by the use of an enzyme such as reverse 
transcriptase. The nucleotide sequence may alternatively 
be an oligonucleotide DNA construct produced perhaps 
using the polymerase chain reaction (PCR).

According to a further aspect of the present
invention there is provided a method of producing a 
polypeptide, the method comprising producing a 
recombinant DNA molecule comprising a truncated form of a 
gene, and expressing the truncated form in a host 
expression system, such as a viral or bacterial 
expression system, to produce the polypeptide. 

Preferably the polypeptide produced is an active 
toxin and desirably a neurotoxin substantially as defined 
above. Preferably the truncated form comprises part of a 
gene which encodes for a non-toxic precursor polypeptide.

Preferably the truncated form comprises a
nucleotide sequence substantially as defined above, and
as shown in SEQ ID NO. 1, or an active derivative thereof.
Preferably the expression system comprises E. coli BL21
(DE3) bacterial cells transformed with pT7-7 vectors
comprising the truncated form of the sequence, desirably
substantially as deposited under Accession No. NCIMB
40632 at The National Collections of Industrial and
Marine Bacteria Limited. The expression system may
comprise a baculovirus system.

In a further aspect of the present invention there 
is provided a recombinant DNA molecule, such as a virus, 
and in particular a baculovirus comprising a 
truncated form of a gene encoding for a toxin generally
as defined above, and substantially as provided in the
microorganism deposited under Accession No. NCIMB 40632.

A still further aspect of the present invention 
provides an expression vector comprising a truncated form 
of a gene, the truncated form encoding for a toxin 
generally as defined above. 

The invention also provides a cell, such as a viral
or bacterial cell transformed with a recombinant molecule
as defined above.

There is also provided an insecticide comprising a
toxin as defined above. The insecticide may be formulated
so as to be administered orally or topically. The
insecticide may comprise a spray.

This invention also provides an insecticide system 
comprising means for expressing a truncated form of a 
gene to produce a toxin as described above in an insect 
to kill or incapacitate the insect. The insecticide 
system may comprise a viral expression system, and 
desirably a baculovirus expression system. 

According to a further aspect there is provided a 
plant comprising a genetically modified cell containing a 
truncated form of a gene sequence substantially as 
defined above. 



Still further according to the present invention
there is provided a non-human animal comprising a
genetically modified cell containing a truncated form of 
a gene sequence substantially as defined above. 



According to a further aspect of the present 
invention there is provided a toxin formed by processing 
of a substantially isolated non-toxic precursor 
polypeptide. 



The toxin is preferably a neurotoxin and is 
preferably formed by truncation toward the carboxy (C) 
end of the precursor polypeptide, preferably by 
site-directed mutagenesis. Desirably the toxin amino 
acid sequence generally corresponds to the amino acid 
sequence of the precursor polypeptide, truncated by 
between 150 and 200 amino acids. The toxin amino acid 
sequence may also be formed by truncation toward the 
amino (N) end of the precursor polypeptide amino acid 
sequence, the fragment cleaved therefrom preferably being 
significantly smaller than the fragment cleaved from the 
carboxy end, and may comprise 7 or 28 amino acids. 

Preferably the toxin has an amino acid sequence
corresponding to a polypeptide encoded by part of a gene
of the Black Widow Spider (Latrodectus mactans
Tredecimguttatus). The toxin may comprise or be an
analogue of the insect specific neurotoxin
δ-Latroinsectotoxin (δ-LIT), or an active derivative
thereof.

Preferably the toxin comprises an amino acid
sequence as shown in SEQ ID NO. 1 and SEQ ID NO. 2 or an
active derivative thereof.

In a further aspect of the present invention there 
is provided a method of producing an active polypeptide 
from an inactive precursor polypeptide, the method 
comprising truncating the isolated precursor polypeptide. 

Preferably the isolated precursor polypeptide is
truncated at the carboxyl end, perhaps using proteolytic
cleavage, and preferably by site-directed mutagenesis.
Truncation of the N terminus may also be provided. 
Preferably the active polypeptide is a toxin and is 
substantially as described above. 

According to another aspect of the present
invention there is provided an isolated nucleotide
base sequence encoding for a toxin precursor polypeptide
as defined above and preferably with an amino acid
sequence as shown in SEQ ID NO. 4 or an active derivative
thereof. The base sequence preferably comprises the
sequence shown in SEQ ID NO. 3 or a derivative thereof.
The nucleotide base sequence preferably encodes a
precursor polypeptide of the neurotoxin
δ-Latroinsectotoxin (δ-LIT). Preferably the base
sequence is substantially as provided in the
microorganism deposited under Accession No. NCIMB 40633.

In a further aspect there is provided a recombinant 
DNA molecule such as a virus, and more particularly a 
baculovirus comprising a sequence as defined in the 
preceding paragraph. 

In a still further aspect the invention provides a 
cell, such as a bacterial cell or viral cell, transformed 
with a recombinant DNA molecule as described in the 
preceding paragraph. 

This invention also provides an insecticide system 
comprising means for expressing a gene as described above 
to produce a precursor polypeptide as described above and 
to process the precursor polypeptide to produce a toxin 
in an insect to kill or incapacitate the insect. The 
insecticide system may comprise a viral expression 
system, and desirably a baculovirus expression system. 

According to a further aspect there is provided a 
plant comprising a genetically modified cell containing a 
gene as defined above. 

Still further according to the present invention
there is provided a non-human animal comprising a
genetically modified cell containing a gene as defined
above.

Preferred embodiments of the present invention will
now be described by way of example only, with reference 
to the accompanying sequences, in which:- 

SEQ ID NO. 1 shows the nucleotide base sequence and 
the corresponding amino acid sequence of a truncated form 
of a gene and a polypeptide encoded thereby, according to 
one aspect of the present invention; 

SEQ ID NO. 2 shows the polypeptide sequence of
SEQ ID NO. 1;

SEQ ID NO. 3 shows the nucleotide base sequence and
the corresponding amino acid sequence of a gene and a
polypeptide encoded thereby, according to another aspect
of the present invention; and

SEQ ID NO. 4 shows the polypeptide sequence of
SEQ ID NO. 3.

Referring to the sequences, a polypeptide such as a
toxin as in SEQ ID NO. 2 is formed by expression of a
truncated form of a gene sequence (SEQ ID NO. 1), or an
analogue thereof.

A toxin from Black Widow Spider (Latrodectus
mactans Tredecimguttatus) venom (BWSV),
δ-Latroinsectotoxin (δ-LIT), has been purified and shown
to possess insect specific toxicity. The δ-LIT
structural gene has been cloned and sequenced and the N-
and C-termini of the native (precursor) and functional
protein toxin have been determined as described below.
Site-directed mutagenesis of δ-LIT cDNA enabled
expression of the mature protein product (toxin) in
bacteria, and this has been shown to be toxic to locusts.



Expression and production of this and other such 
toxins in bacterial expression systems has hitherto not 
been possible. The invention includes identification of 
the sites for cleavage of the precursor protein to 
produce the toxin, and the precise site of truncation of 
the gene sequence which has enabled the toxin to be 
expressed in bacterial, and indeed other suitable, hosts.



Microorganism deposits have been made under the
Budapest Treaty on 3rd May 1994, at the National
Collections of Industrial and Marine Bacteria Limited, of
23 St. Machar Drive, Aberdeen, Scotland, United Kingdom.
Escherichia coli (XL-1 Blue pT7.δM) cloned with the
truncated form of the gene sequence is deposited under
Accession No. 40632, and Escherichia coli (HMS 174
pT7.δFL) cloned with substantially the full gene sequence
is deposited under Accession No. 40633.



In more detail, the cDNA cloning and sequencing was
conducted as follows. Poly(A+)-RNA was isolated from
venom glands of the Black Widow Spider (Latrodectus
mactans Tredecimguttatus) and a cDNA library constructed
in the plasmid vector pSP65 (according to Kiyatkin et al,
1993). A library of 6x10^ clones was screened with an
end-labelled 23-mer oligonucleotide probe based on the
N-terminal sequence of δ-LIT (amino acid residues 1-8):
5' GA(C/T)GA(A/G)GA(A/G)GA(C/T)GG(A/T)GAAATGAC 3'.
Hybridization was performed. Positive clones were
colony-purified and analysed by restriction mapping. The
inserts were excised and fragmented by sonication as
described (Sambrook et al, 1989) followed by cloning into
the SmaI site of pBluescript II SK+ and SK- vectors
(Stratagene, USA). Single-stranded templates for
sequencing were obtained after infection with helper
phage VCS (Stratagene). The DNA sequences were
determined by the chain-termination method (Sanger et al,
1977) using Sequenase 2.0 version kit (USB Corporation)
and T7 and T3 vector-specific primers (Stratagene). Each
sequence was determined at least twice on both strands.
Synthetic primers were used to sequence regions that were
not covered by isolated subcloned fragments.
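
For illustration, the degenerate probe quoted above can be expanded into the individual sequences it represents. The following minimal sketch (function names are hypothetical, and only the codons that occur in the probe are tabulated) checks that the probe is a 23-mer, counts the sequences in the mixture, and confirms that each one encodes the N-terminal residues DEEDGEM followed by two bases of the eighth codon.

```python
from itertools import product

# Degenerate probe as quoted in the text; parentheses mark mixed positions.
PROBE = "GA(C/T)GA(A/G)GA(A/G)GA(C/T)GG(A/T)GAAATGAC"

# Only the codons that can occur in this probe (standard genetic code).
CODONS = {"GAT": "D", "GAC": "D", "GAA": "E", "GAG": "E",
          "GGA": "G", "GGT": "G", "ATG": "M"}


def parse_probe(spec):
    """Return a list with the allowed bases at each probe position."""
    positions, i = [], 0
    while i < len(spec):
        if spec[i] == "(":
            j = spec.index(")", i)
            positions.append(spec[i + 1:j].split("/"))
            i = j + 1
        else:
            positions.append([spec[i]])
            i += 1
    return positions


def expand(spec):
    """Enumerate every non-degenerate sequence covered by the probe."""
    return ["".join(bases) for bases in product(*parse_probe(spec))]


def translate(seq):
    """Translate complete codons only; trailing bases are ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODONS[seq[i:i + 3]] for i in range(0, usable, 3))


variants = expand(PROBE)
print(len(parse_probe(PROBE)), "positions,", len(variants), "sequences")
print(sorted(set(translate(v) for v in variants)))
```

Each of the five mixed positions offers two bases, so the mixture contains 2^5 sequences, all of which translate to the same peptide.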

DNA and protein sequence analysis was performed
using the computer software DNASTAR (DNASTAR Inc) and
PCGENE (IntelliGenetics Inc). This work benefitted from
the GCG programme mounted on the SERC Daresbury SEQNET
facility (Devereux, Haeberli and Smithies, (1984),
Nucleic Acids Research 12(1); 387-395).

The full-length cDNA construction was carried out
as follows. Two sets of oligonucleotide primers were
used to produce N- and C- overlapping parts of δ-LIT
coding sequences by polymerase chain reaction. To
facilitate subcloning into the expression vector the 5'
sense primer (P1) (TTG GGATCC GATGAAGAAGATGGAGAA) and 3'
antisense primer (PS) (CAATG GTCGAC ACAGAAGGAATGGTA)
contained BamHI and SalI restriction enzyme sites. Two
other primers, P9, sense (GTCTGAACCATTTACTGTCC) (position
1283-1302) and P3, antisense (GTAAGATTACCATCTGCAAC)
(complementary to position 2253-2272), were chosen to
produce overlapping fragments with an internal NcoI
(2056) restriction site. An oligonucleotide was designed
to terminate the protein sequence after amino acid 991:
5' CGTTTC GTCGAC TCATTCCGGTAAAGTACGACGAAA 3'. The
polymerase chain reaction was performed using 1 unit of
Taq-polymerase (Promega) under standard conditions (30
cycles, 55°C for 1 min, 72°C for 3 min, 94°C for 1 min,
with 100 pmol of each primer and 1-10 ng first-strand
cDNA). In the first cycle the denaturation time was
extended to 5 min. The molecular mass of the amplified
material was checked on an agarose gel. First-strand
cDNA synthesis was carried out using First-strand cDNA
Synthesis Kit (Pharmacia) with both random and specific
primers as recommended by the manufacturer. The PCR
products were purified from agarose gel using GeneClean
Kit (Bio 101 Inc.), digested with appropriate pairs of
enzymes (BamHI and NdeI for the N-terminus part and
SalI/NdeI for the C-terminus) and cloned into the pT7-7
vector restricted with the same pairs of enzymes. The
full-length cDNA was created as a result of three-way
ligation between N-terminal BamHI/NdeI-fragment,
C-terminal NdeI/SalI-fragment and pT7-7 BamHI/SalI-
digested vector. The final construct had eight
additional amino acid residues at the amino terminal end
(MARIRARG). All plasmid constructs were verified by
sequencing from both ends and through the junction
region. The full length construct was designated pT7.δFL,
a sample of which is deposited at the NCIMB, accession
No 40633, and the truncated clone (1-991 amino acids) was
designated pT7.δM (NCIMB No 40632).
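
As an illustrative check of the primer design described above, the sketch below locates the BamHI and SalI recognition sequences in primer P1 and in the termination oligonucleotide and translates the adjoining coding bases. It assumes, which the text does not state explicitly, that the termination oligonucleotide is written as the antisense strand, so that its reverse complement is read as the coding strand. Function names are hypothetical; the standard genetic code is used.

```python
# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

BAMHI_SITE = "GGATCC"
SALI_SITE = "GTCGAC"

# Sequences as quoted in the text, with the spacing removed.
P1 = "TTGGGATCCGATGAAGAAGATGGAGAA"
STOP_OLIGO = "CGTTTCGTCGACTCATTCCGGTAAAGTACGACGAAA"


def translate(seq):
    """Translate complete codons; '*' marks a stop codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# P1: coding bases that follow the BamHI site.
p1_coding = P1[P1.index(BAMHI_SITE) + len(BAMHI_SITE):]
p1_peptide = translate(p1_coding)

# Termination oligonucleotide: read the coding strand up to the SalI site.
stop_sense = reverse_complement(STOP_OLIGO)
stop_peptide = translate(stop_sense[:stop_sense.index(SALI_SITE)])

print("P1 encodes:", p1_peptide)
print("Termination oligonucleotide encodes:", stop_peptide)
```

Under that assumption the coding strand ends in a TGA stop codon immediately followed by the SalI site, and P1 begins the coding sequence at the Asp-Glu-Glu-Asp N-terminus directly after the BamHI site.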

In order to verify the identity of the δ-LIT cDNA,
this clone was expressed in the bacterial pT7-7 vector in
E. coli BL21(DE3) cells. A full-length toxin cDNA
(corresponding to Asp residue 29 to residue 1214 of
SEQ ID NO. 4, 1186 amino acids) was constructed and
designated pT7.δFL. The first 28 amino acids are
believed to be present in the precursor polypeptide in
spider venom glands, but cleaved during N-terminal
processing. The recombinant protein constitutes
approximately 10% of the total bacterial lysate protein.
A polyclonal antibody was raised to δ-LIT purified from
spider venom glands, and demonstrated to be specific for
the δ toxin. This antibody specifically detected a
protein of 130 kDa in bacteria expressing recombinant
full-length δ-LIT. Comparison of the molecular mass of
the bacterially expressed full-length δ-LIT and the toxin
purified from venom glands demonstrated a size difference
of approximately 23 kDa, in agreement with the calculated
molecular mass. The full-length δ-LIT had no toxicity
towards insects and is considered to be an inactive
precursor form of the toxin.

δ-LIT purified from venom glands was analysed by
mass spectrometry (on a Kratos Kompact MALDI 3 Mass
Spectrometer, using sinapinic acid as a matrix; the
nitrogen laser excitation was at 337nm, and the positive
ion was detected in the linear mode), yielding a prominent
molecular ion with an m/z+ ratio of 110916. This
corresponds closely to the expected molecular mass of
δ-LIT which is truncated at amino acid 991. By
comparison, the bacterially expressed full length δ-LIT
yielded a molecular ion with an m/z+ ratio of 133631 (VK,
DRB, PNRU, data not shown), within 100 Da of the
calculated value. Site-directed mutagenesis was used to
create a novel δ-LIT cDNA clone (pT7.δM), which was
truncated after amino acid 991 of the δ-LIT sequence
(SEQ ID NO. 2). This protein was expressed in bacteria,
yielding a protein of similar molecular mass to the
mature toxin isolated from spider venom.
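
The mass figures quoted above can be compared with the proposed C-terminal truncation by simple arithmetic. The sketch below is illustrative only: it assumes singly charged ions, so that m/z approximates molecular mass in Daltons, and it ignores the eight additional N-terminal residues carried by the bacterial construct, so the per-residue figure is an upper estimate. Variable names are hypothetical.

```python
# m/z values quoted in the text (assumed to be singly charged ions).
VENOM_TOXIN_MZ = 110916   # toxin purified from venom glands
FULL_LENGTH_MZ = 133631   # bacterially expressed full-length protein

# Residue counts quoted in the text.
FULL_LENGTH_RESIDUES = 1186   # polypeptide starting at Asp (+1)
TRUNCATED_RESIDUES = 991      # toxin truncated after amino acid 991

mass_difference = FULL_LENGTH_MZ - VENOM_TOXIN_MZ
residues_removed = FULL_LENGTH_RESIDUES - TRUNCATED_RESIDUES
mean_mass_per_residue = mass_difference / residues_removed

print("Mass difference (Da):", mass_difference)
print("Residues removed:", residues_removed)
print("Mean mass per removed residue (Da): %.1f" % mean_mass_per_residue)
```

A difference of roughly 23 kDa spread over 195 residues gives a mean residue mass of the order expected for a polypeptide, which is consistent with the size difference reported for the two proteins.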

E. coli BL21(DE3) cells transformed with pT7 clones
were grown in LB medium containing 100mg ampicillin/ml at
30°C to an A600nm of approximately 0.5. Then
expression was induced by addition of IPTG (1mM) to the
medium, and incubation continued for 1 hour. For
functional studies, bacteria were washed and resuspended
in 50 mM Tris-HCl, 100mM NaCl, 10mM KCl, 0.4% Triton
X-100, 12% (w/v) sucrose, 5mM DTT, 2 µg/ml aprotinin, 2mM
EDTA, pH8, and sonicated on ice. Ammonium sulphate was
added to the cleared supernatant to a final concentration
of 20% of saturation, and the pellet was resuspended in
buffer without DTT. These samples (5-15 µl) were used
for thoracic injection into locusts (100-300 mg body
weight); each test was performed on more than 4 locusts,
and the locusts were examined for toxicity for 24 hours.
Extracts from pT7-7 and pT7.δFL produced no effects on
the locusts, but extracts from bacteria carrying pT7.δM
caused rapid lethality. The time of death of the locusts
varied from 5 minutes to 4 hours, depending on the potency
of the batch of toxin.

Preliminary studies were undertaken on
neurally-excited and resting retractor unguis
nerve-muscle preparations isolated from metathoracic legs
of adult (male and female) locusts (Usherwood and
Machili, 1968). δ-LIT was applied in standard locust
saline (mM: NaCl, 180; KCl, 10; CaCl2, 2; Hepes, 10 (pH
6.8)). A few studies were undertaken using saline in
which CaCl2 was omitted. Mechanical responses were
recorded using a Grass strain gauge connected to a Grass
pen recorder. Recordings of miniature excitatory
postsynaptic potentials were made from fibres of
metathoracic extensor tibiae muscles of adult locusts
(either sex) using intracellular microelectrodes
(approximately 10 MΩ resistance). δ-LIT was applied in
either standard locust saline, saline in which CaCl2 was
omitted or saline which contained MgCl2 substituted for
NaCl. The miniature potentials were recorded on video
tape and analysed on a MassComp computer using in-house
software. Membrane bilayers were formed at the tips of
patch pipettes (diam 1-2 µm; fabricated from Clark
Electromedical glass) from monolayers of either
diphytanoyl phosphatidylcholine or a mixture of 9 parts
asolectin and 1 part cholesterol using a pipette dipping
technique (Montal and Muller, 198). Similar patch
pipettes were used to excise membrane patches from locust
metathoracic extensor tibiae muscle fibres (Huddie et
al). In order to reduce the activities of endogenous
potassium channels KCl was eliminated from the pipette
and bath salines.

The neurally-evoked twitch contraction of the
locust retractor unguis muscle was reduced by
approximately 40% by 10^-11 M δ-LIT (applied in
standard saline) and was abolished during application of
10^-10 M toxin. Small spontaneous contractions sometimes
occurred during δ-LIT application. The changes in
twitch amplitude were accompanied by an irreversible
muscle contracture. The appearance of the contracture
was delayed and its amplitude was reduced when the
concentration of δ-LIT was lowered. A muscle
contracture also occurred when toxin was applied when the
muscle was not neurally stimulated. Twitch contractions
do not occur in calcium-free saline and when 10"^M
toxin was applied to a preparation equilibrated in this
saline a contracture did not occur even after 30 min
application of the toxin.

When inside-out patches excised from locust muscle
fibres were exposed to 10*** M δ-LIT in the patch
pipette, channel openings, of maximum conductance
approximately 40 pS, were observed. Channel openings
of this type were never seen in the absence of toxin.
The channel current exhibited inward rectification when
the patch pipette and bath contained identical salines
(including 2mM CaCl2), and channel open times were longer
at negative than at positive pipette potentials. When
there was a 10-fold Ca2+ gradient across a patch, the
reversal potential of the channel current was +/-15 mV,
the sign being dependent on the Ca2+ gradient.

In the artificial bilayer studies where 10 M 
δ-LIT was placed in the patch pipettes, single channel
openings of approximately 30 pS conductance were
observed. These channels were not seen when toxin was
omitted from the patch pipette. With identical salines
(containing 2mM CaCl2) in the patch pipette and bath, the
current-voltage characteristic of the δ-LIT channel was
sigmoidal with a reversal potential at 0 mV. The channel
was shown to be Ca-selective by manipulating the ionic
regimes of patch pipette and bath.

A cDNA library from venom gland cDNA was screened
with a 23-bp oligonucleotide probe corresponding to the
N-terminal sequence of δ-LIT (as described above). To
reduce the number of nucleotide ambiguities the codon
usage data available from the nucleotide sequences of
α-LT and α-LIT cDNA (Kiyatkin et al, 1990, Kiyatkin et
al, 1993) was referred to. Five positive cDNA clones were
colony-purified and sequenced. The longest clone (pDT-1)
contained more than 2 kb of δ-LIT coding region. A
PstI-3' fragment was used to rescreen the cDNA library to
search for clones encoding the C-terminal part of the
toxin. An additional cDNA clone, pDT-17, was isolated,
which covered the C-terminal coding region of the δ-LIT.
Two overlapping clones, covering the entire open reading
frame, have been sequenced in their entirety. The two
clones have been demonstrated to be part of a single,
continuous RNA from venom glands by polymerase chain
reaction across the overlapping region, using two
distinct sets of primers. The composite clones encode a
cDNA with a frame of 3642 bp starting from the first
in-frame Methionine and ending with a TAA stop codon
(SEQ ID NO. 3).

The Met residue is preceded by an in-frame stop 
codon confirming the full length of the deduced sequence. 

δ-LIT was purified to homogeneity from Black Widow
Spider venom by three rounds of column chromatography
according to Krasnoperov et al (1992). 23 amino acid
residues of the N-terminal sequence of δ-LIT were
sequenced. The pure toxin was digested with trypsin and
seven individual peptides were isolated and partially
sequenced.

Direct N-terminal sequence determination
demonstrates that the mature protein starts from the
sequence DEEDGEM..., so residue 1 in SEQ ID NO. 1 and 2
is the first Asp of this sequence. The deduced polypeptide
starting from Asp (+1) consists of 1186 amino acid
residues (as shown in SEQ ID NO. 3, Asp residue 29 to
residue 1214) with a predicted molecular mass of 132671
Daltons and pI of 5.4. It contains all of the peptide
sequences determined by amino acid sequencing analysis.
There are two in-frame Met residues (-7 and -28) upstream
of the N-terminus (as shown in SEQ ID NO. 3) of the mature
protein which can serve as translation initiation sites.
The nucleotide sequence surrounding the ATG codon for Met
(-7) correlates better with the classical Kozak consensus
(Kozak, 1989), but the nucleotide arrangement for Met
(-28) strongly corresponds to starting points for at
least two other known proteins which have been isolated
from arachnids: Major house dust mite allergen (AAAATGA)
(Yuuki et al, 1991) and Low molecular weight protein
co-purified with α-Latrotoxin (AAATGA) (Kiyatkin et al,
1992). In both cases, the deduced sequence preceding the
N-terminus of the mature protein does not correspond to
classical signal peptide structures. We conclude that
post-translational modification of δ-LIT N-terminus is
limited to removal of 7 or 28 amino acid residues. The
existence of a cluster of positive amino acid residues
Arg-X-Lys-Arg (-1 to -4) which can serve as a potential
endopeptidase-cleavage site supports the hypothesis that
post-translational processing occurs at the N-terminus.

Analysis of the deduced structure of δ-LIT with
PEST (Rogers, S. et al, 1986) reveals the presence of an
amino acid sequence enriched in P, E, S and T, which has
previously been correlated with rapid degradation of
intracellular proteins (Gottesman & Maurizi, 1992). This
region has the sequence EESGAPEGSFDSPSS, and is situated
between residues 956-970. The presence of the
PEST-region in the C-terminal part of δ-LIT is
consistent with C-terminal processing of this protein.
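
The enrichment referred to above can be illustrated by counting residues in the quoted region. This is a simple composition count and is not the scoring procedure of the PEST programme; the sequence is taken as EESGAPEGSFDSPSS (residues 956-970), and the variable names are hypothetical.

```python
from collections import Counter

# Region quoted in the text, with its residue coordinates.
PEST_REGION = "EESGAPEGSFDSPSS"
REGION_START, REGION_END = 956, 970

counts = Counter(PEST_REGION)
# Residues belonging to the P, E, S, T class.
pest_residues = sum(counts[residue] for residue in "PEST")
pest_fraction = pest_residues / len(PEST_REGION)

print("Region length:", len(PEST_REGION))
print("Counts:", {r: counts[r] for r in "PEST"})
print("Fraction P/E/S/T: %.2f" % pest_fraction)
```

The region length agrees with the quoted coordinates, and about two thirds of its residues are Pro, Glu or Ser; no Thr occurs in this stretch.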

Computer analysis of δ-LIT predicts three putative
transmembrane helices, two of them situated in terminal
regions (residues 39-67 and 221-240) and the third one, of
a minimal length (residues 580-595), being in the central
region. The second putative transmembrane helix
(residues 221-240) belongs to a region that is highly
conserved between all spider high molecular weight
protein neurotoxins (Kiyatkin et al, 1993).



Dot-matrix analysis of the predicted δ-LIT amino
acid sequence revealed the presence of a repeated motif
in the central part of the protein molecule. 460 amino
acid residues of the δ-LIT primary structure comprise
tandemly arranged imperfect copies of the ankyrin-like
repeats (Michaely & Bennett, 1992). Whereas α-LT and
α-LIT (Kiyatkin et al, 1990, Kiyatkin et al, 1993) have
no less than 20 repeats, δ-LIT has been found to have
only 13 successive repeated units. Their optimal
alignment is with phasing originally suggested in (Lux et
al, 1990). The sequence of 13 amino acids which precede
the first repeat can be viewed as a reduced repeated unit
according to its good correlation with a consensus
sequence. The majority of δ-LIT repeated units are
33-34 amino acids in length, but two repeats contain 35
(R1) and 36 (R6) residues, respectively.

Analysis with the PCOMPARE programme showed a
linear correlation between the repeated units of the two
insect-specific toxins. Strong linear correspondence has
been found for δ-LIT repeats R2-R9 in comparison to the
analogous repeats in α-LIT (Kiyatkin et al, 1993). The
first repeat in δ-LIT does not correspond well to the
first one in α-LIT and shows high similarity to R7 from
α-LIT. δ-LIT repeat R10 is most similar to R19 from
α-LIT: this repeat is unusual in having Ser and Gly
residues at position 8 and 31, respectively. The next
stretch of similarity is found between R11-R13 of δ-LIT
and R10-R12 of α-LIT. We have noted that the R7, R2 and
R9 repeats are the most highly conserved between the
insectotoxins, suggesting a functional role in
insectotoxicity. It has been shown that erythrocyte
ankyrin repeats are not equivalent in respect of their
functional ability to bind different proteins (Davis et
al, 1991), and thus toxin repeats are also expected to
make different contributions to their function.

Dot-matrix comparison of δ-LIT and α-LIT shows that
they share a similar overall organization, with the
strong central diagonal broken once (between 900 and 1130
amino acid residues of δ-LIT) and restored for the
last 160 amino acids of both toxins. The displacement of
the central diagonal reflects the difference in toxin
length; δ-LIT is 190 amino acids shorter than its
insect-specific counterpart.

The mature protein can be divided into several
structural domains: an N-terminus consisting of about 470
amino acid residues and possessing strong linear homology
with α-LIT; the central domain of about 430 amino acids
almost completely comprised of tandemly arranged
ankyrin-like repeated units; and a C-terminal domain of
about 160 amino acids.



Alignment of the insectotoxin protein sequences 
shows that both the N- and C-terminal structural domains 
demonstrate the presence of high identity regions 
separated by rather dissimilar sequences, with a high 
level of identity (44.9% for the N-terminal domain and 
37.1% for the C-terminus). The most dramatic changes in 
primary structure of the two insectotoxins are 
concentrated in the C-terminal parts of the repeat-containing 
domains. A stretch of homology is localized to 13 
ankyrin repeat units of δ-LIT. This region is followed 
by a sequence of about 110 amino acid residues that has 
no obvious homology either with α-LIT or α-LT or with 
any other proteins from the NBRF-PIR database. 
Interestingly, this domain, which is absent from δ-LIT, 
forms a specific region in the primary structure of 
α-LIT that has an unusual clustering of Cys residues and 
possesses homology with mammalian-specific α-LT (Kiyatkin 
et al, 1993). Such a striking structural difference between 
the two insect-specific neurotoxins suggests that the 
C-terminal part of the ankyrin-like repeated domain plays 
a particular role in providing a structural basis for 
their different functional properties. 
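Identity figures such as those quoted for the N- and C-terminal domains are computed from a pairwise alignment. The sketch below shows one common convention, assuming the two sequences are already aligned to equal length with "-" marking gaps, and counting only columns where both sequences have a residue. The specification does not state which convention or alignment method was used, so the function name and the choice of denominator are assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments, for illustration only.
    print(percent_identity("ACDE-F", "ACDQGF"))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which gives lower values for the same alignment.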

The high molecular weight protein toxins from the 
venom of the Black Widow Spider are a potent and specific 
class of toxins. These toxins offer great potential 
for elucidating the function of neural proteins, and for 
providing insect-specific toxins. However, this potential 
has not previously been realised due to the inability to 
express these protein toxins with any function. The 
present invention provides for the cloning of a novel DNA 
transcript encoding a novel insect-specific toxin, 
and for functional expression of this toxin, and other 
polypeptides, in bacteria. 



WO 95/29235 PCT/GB95/00917 

The δ-LIT cDNA was cloned with an oligonucleotide 
based on the sequence of amino acids 1-23 of the toxin, 
and its identity confirmed by additional peptide 
sequences, and by immunochemical identity, using an antibody 
specific for δ-LIT. The deduced primary structure 
of δ-LIT has considerable similarity to the mammalian-specific 
α-LT and the insect-specific α-LIT, suggesting 
that these toxins are part of a family with similar 
structure. The three proteins have a central domain 
which is composed of "ankyrin-like" repeats, with 13 
repeats in δ-LIT. The ankyrin family of proteins couple 
spectrin to a variety of integral membrane proteins 
(Bennett, 1992), and it is believed that the "ankyrin 
repeat" domain of the ankyrins is responsible for 
specific binding to proteins (Davis and Bennett, 1990 J 
Biol Chem 265: 10589-10596; Davis et al (1991) J. Biol 
Chem 266: 11163-11169). This structural similarity with 
the ankyrin family is reflected in the known functional 
properties of the latrotoxins; α-LT is known to bind to a 
receptor with high affinity (Kd ~ 10^-9 M). It 
remains to be determined whether this specific binding to 
the α-LT receptor is mediated via the ankyrin repeat 
region of the toxin. 
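To put a dissociation constant of about 10^-9 M in context, the sketch below computes receptor occupancy for a simple one-site equilibrium, fraction bound = [L] / ([L] + Kd). The one-site model and the function name are assumptions made for illustration; the specification gives only the Kd value and does not describe the binding model.

```python
def fraction_bound(ligand_molar, kd_molar):
    """Fraction of receptor occupied at equilibrium for one-site binding.

    ligand_molar is the free ligand concentration and kd_molar the
    dissociation constant, both in mol/l.
    """
    if ligand_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return ligand_molar / (ligand_molar + kd_molar)


if __name__ == "__main__":
    kd = 1e-9  # about 1 nM, the order of magnitude quoted in the text
    for conc in (1e-10, 1e-9, 1e-8):
        print(f"{conc:.0e} M -> {fraction_bound(conc, kd):.2f} occupied")
```

Under this model half of the receptor is occupied when the free toxin concentration equals Kd, which is why a nanomolar Kd is described as high affinity.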

Surprisingly, δ-LIT has no greater similarity to 
the insect-specific α-LIT (38%) than to the mammalian-specific 
α-LT (31%). Whereas δ-LIT has only 13 repeats, 
the α-LT and α-LIT have 19 and 20 ankyrin repeats, 
respectively. The latter 6/7 repeats have no counterpart 
in the δ-LIT, and may be a structural unit, as the α-toxins 
both contain 6 cysteine residues in this region, 
with partially conserved spacing. However, in view of 
the differences in target toxicity of the α-LT and 
α-LIT, it is not possible to identify this structural 
feature with insect-specific toxicity. 

δ-LIT exhibits a marked disparity between the 
molecular weight of the toxin, as deduced from the cDNA 
sequence, and the relative mobility of the pure toxin 
purified from venom. Whilst the N-terminus of the 
protein has been identified unambiguously by protein 
sequencing, the precise position of the C-terminus has 
been difficult to document. Expression of the full 
length δ-LIT cDNA in bacteria demonstrated that the 
calculated molecular mass is accurately reflected in the 
relative mobility of the protein on SDS-PAGE, and that 
the toxin in natural venom derives predominantly, if not wholly, 
from proteolytic C-terminal processing. The full-length 
recombinant protein was purified, but was not toxic to 
locusts under any conditions. The full-length protein is 
an inactive precursor of the functional toxin. 
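A molecular mass "deduced from the cDNA sequence" is obtained by summing residue masses over the translated sequence. The sketch below does this with standard average residue masses and one-letter amino acid codes. The function name is hypothetical, the example input is a toy peptide rather than the toxin sequence, and no mass values for the toxin itself are asserted here.

```python
# Average residue masses in daltons (free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # added once for the free N- and C-termini


def protein_mass(sequence):
    """Average molecular mass in daltons of an unmodified polypeptide."""
    sequence = sequence.upper()
    unknown = set(sequence) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unknown residue codes: {sorted(unknown)}")
    if not sequence:
        return 0.0
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER


if __name__ == "__main__":
    # Toy peptide for illustration only.
    print(round(protein_mass("DEEDGEMTLEER"), 1))
```

Applied to a precursor and to its C-terminally truncated form, the difference between the two results estimates the mass removed by processing, which is the kind of discrepancy the SDS-PAGE comparison above addresses.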



The precise site of the C-terminus of δ-LIT 
purified from venom was assessed by MALDI-mass 
spectrometry, which localised the site of cleavage to 
amino acid 991 of the protein. The cDNA was mutated to 
produce a protein of 991 amino acids with a sequence as 
shown in SEQ ID NO: 2, and expressed in bacteria. The mature 
recombinant protein was soluble and was lethal to 
locusts. Partial purification of the protein suggests 
that the toxin is highly toxic. 
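At the sequence level, truncating a coding sequence after a chosen residue amounts to keeping the first N codons and adding a stop codon. The sketch below shows that operation in silico; it is an illustration of the arithmetic, not the mutagenesis protocol used, and the function name and toy input are assumptions. For N = 991 the result is 991 x 3 + 3 = 2976 bases, which agrees with the 2976 base pair length and the terminal TGA codon listed for SEQ ID NO: 1.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def truncate_cds(cds, n_codons, stop="TGA"):
    """Keep the first n_codons of a coding sequence and append a stop codon."""
    cds = cds.upper()
    if stop not in STOP_CODONS:
        raise ValueError("stop must be one of TAA, TAG, TGA")
    if n_codons < 1 or len(cds) < 3 * n_codons:
        raise ValueError("coding sequence is shorter than n_codons")
    kept = cds[: 3 * n_codons]
    # Guard against a stop codon already present inside the retained region.
    for i in range(0, len(kept), 3):
        if kept[i:i + 3] in STOP_CODONS:
            raise ValueError(f"premature stop codon at codon {i // 3 + 1}")
    return kept + stop


if __name__ == "__main__":
    # Toy open reading frame for illustration only.
    print(truncate_cds("ATGAAACCCGGGTTTTAA", 3))
```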



Expression of the mature toxin using the 
truncated form of the full gene sequence as described 
above has considerable advantages. Firstly, the toxin 
can be produced relatively easily by functional 
expression of the truncated form in a bacterial system, 
thereby obviating the need to purify toxin from venom 
glands of spiders. This enables industrial production of 
the toxin and hence commercial exploitation, for example 
as the major component of an insecticide system. 

Moreover, it presents possible administration 
systems for the toxin as an insecticide, besides 
conventional methods such as spraying. For example, it 
may be possible to produce a modified plant cell or 
plant, such as a crop plant, containing a recombinant 
molecule incorporating the truncated sequence. Such a 
system comprises a recombinant baculovirus comprising the 
truncated form. Such viruses are highly infectious in 
vivo and resistant to inactivation in host cells, and are 
capable of high levels of expression of the inserted 
nucleotide sequence in host insect cells. This is 
expected to be harmless to the plant and indeed to 
vertebrates. Upon ingestion of the plant tissue an 
insect will take in the recombinant molecule and/or 
toxin, resulting ultimately in the death of the insect. 
Since the toxin is insect specific, it is expected to 
have no detrimental effect on humans or animals upon 
consumption. 

This is an example of one use of one embodiment of 
the invention to express a toxin that is usually produced 
by post-translational modification of a precursor protein 
in biological systems, in a bacterial expression system. 
It is to be appreciated that the truncated form of other 
genes coding for other proteins could be expressed in 
this way, and fall within the scope of the present 
invention. 

The invention also provides toxin formed from the 
expression of a full, isolated gene to produce a 
precursor polypeptide which is then post-translationally 
modified. The precursor polypeptide has an amino acid 
sequence as shown in SEQ ID NO: 4, and the toxin has a 
sequence as shown in SEQ ID NO: 2. 

The isolated gene (SEQ ID NO: 3) (or an analogue) 
encoding the precursor polypeptide of the toxin 
δ-LIT can be cloned into a vector for expression of the 
precursor polypeptide. A baculovirus expression system 
can be used. The precursor polypeptide thus produced can 
then be truncated at the sites indicated above, by site 
directed mutagenesis, to produce an active toxin. This 
enables the toxin δ-LIT, or an active derivative thereof, 
to be produced independently of the Black Widow Spider, 
and thus on an industrial scale, for use as indicated as 
an insecticide. 

Whilst endeavouring in the foregoing specification 
to draw attention to those features of the invention 
believed to be of particular importance it should be 
understood that the Applicant claims protection in 
respect of any patentable feature or combination of 
features hereinbefore referred to and/or shown in the 
drawings whether or not particular emphasis has been 
placed thereon. 
SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: 
(A) NAME: BRITISH TECHNOLOGY GROUP LIMITED 
(B) STREET: 101 NEWINGTON CAUSEWAY 
(C) CITY: LONDON 
(E) COUNTRY: UNITED KINGDOM 
(F) POSTAL CODE (ZIP): SE1 6BU 

(ii) TITLE OF INVENTION: A NOVEL TOXIN AND A METHOD OF PRODUCING TOXIN 

(iii) NUMBER OF SEQUENCES: 4 

(iv) COMPUTER READABLE FORM: 
(A) MEDIUM TYPE: Floppy disk 
(B) COMPUTER: IBM PC compatible 
(C) OPERATING SYSTEM: PC-DOS/MS-DOS 
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO) 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 2976 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PLASMID DNA" 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: LATRODECTUS MACTANS TREDECIMGUTTATUS 

(vii) IMMEDIATE SOURCE: 
(B) CLONE: pT7.deltaM 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 1..2976 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 



GAT GAA GAA GAT GGA GAA ATG ACT CTA GAA GAA AGA CAA GCA CAA TC.- 

Asp Glu Glu Asp Gl y Glu Met Thr Leu Glu Glu Arg JfJ Cys 

£0J Ala lT« It 6 ^ c GC ^ T I" GTT 777 GGG ATG ATC GCT 6*T GTA 

L-ys Ala lie Glu Tyr Ser Asn Ser Val Phe Gly Met: lie Ala Asp Val 

25 jo 

G ?I 1 AC GAC ATC GGT TCC ATT CCT G ~ A ATT GGC GAA GTA GTT GGC ATT 

~la Asn Aso lie Gly Ser He Pro Val lie Gly Glu VaY Val Gly Vu 

-3 40 45 

GTA ACT GCC CCA ATT GCC ATC GTA ACT CAC ATT ACT AGC GCA GGC TTG 

Val ,hr Aia Pre He Ala He Val Ser His n e Thr Ser Ala Gly l» 

- u 55 60 

G tl t" A ?? T T CT ACG QCA ~ A GAT TGT GAT GAT ATA CCT TTT GAT GAG 

" sc 1,e Aia s «r Thr Ala Leu Asp Cys Asp Asp lie Pro Phe Asp tlZ 
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GAC TTT AAA TCA TCA CTA ACA GGA GGA GAT GAC GGA TTA ATA GAT AAT 
Asp Phe Lys Ser Ser Leu Thr Gl y Gly Asp Asp Gly Leu lie Asp Asn 
275 280 285 



480 



528 



65 70 - 32 - 75 8Q 

ATT AAG GAA ATA TTA GAA GAA AGA TTC AAT GAA ATa r A - t~p 
He Lys Glu lie Leu Glu Glu Arg Phe A^n tft JT5 Asp ^ Leu 283 

8 - 90 95 

GAC AAG AAC ACA GCT GCT TTG GAA GAG GTC TCT AAA CTG GTA AGT AAA 

Asp Lys Asn Tnr Ala Ala Leu Glu Glu Val Ser Lys Leu Va? Ser Lys ° 

100 no 

£5 T Tp* ^ A £ G GT< ? GAA ACA AGG AAT GAA ATG AAC GAA AAT TTT no, 

Thr Phe Val Thr Val Glu Lys Thr Arg Asn Glu Met Asn Glu^ Tsn Phi 384 
'tS 120 125 

AAG CTT GTT TTG GAA ACT ATA GAA A GC AAA GAA ATA AAA TCA ATT GTA <*<x? 
Lys Leu Val Leu Glu Thr lie Glu Ser Lys Glu He Lys Ser He Val 
13G 135 140 

TTC AAA ATA AAT GAT TTT AAA AAG TTT TTT GAA AAA GAA CGA CAA AGA 
Phe Lys lie Asn Asp Phe Lys Lys Phe Phe Glu Lys Glu Arg Gin Ara 
14 ° ISO 155 16 0 

A J T AAA GGT ~S CCT AAA GAT AGG TAT GTT GCT AAG CTT CTA GAA CAA 
He Lys Gly Leu Pro Lys Asp Arg Tyr Val Ala Lys Leu Leu Glu Gin 
165 170 j 

AAA GGT ATT TTA GGT TCT TTA AAA GAA GTA AGA GAA CCA TCT GGA AAC c T r 
Lys Gly He Leu Gly Ser Leu Lys Glu Val Arg Glu PrU Ser GW ten 
180 185 19Q 

AGT CTG AGC TCC GCG TTA AAT GAA CTC TTA GAC AAA AAC AAC AAC TAT 
Ser Leu Ser Ser Ala Lau Asn Glu Leu Leu Asp Lys Asn Asn Asn Tyr 
19c 200 205 

GCC ATC CCA AAA GTG GTT GAT GAT AAT AAG GCC TTT CAG GCG CTG TAT . 672 
Ala lie Pro Lys Val Val Asp Asa Asn Lys Ala Phe Gin Ala Leu Tyr 
210 215 220 

GCT TTA TTT TAT GGA ACT CAG ACT TAT GCA GCC GTT ATG TTT TTC TTA 720 
Ala Leu Phe Tyr Gly Thr Gin Thr Tyr Ala Ala Val Met Phe Phe Leu 
22a 230 225 240 

CTC GAA CAA CAT TCT TAT CTG GCT GAT TAT TAT TAC CAA AAA GGT GAT 763 
Leu Glu Gin His Ser Tyr Leu Ala Asp Tyr Tyr Tyr Gin Lys Gly Asa 
245 250 255 

GAT GTA AAT TTT AAT GCA GAA TTT AAT AAT GTA GCA ATT ATT TTT GAT 816 
Asp Val Asn Phe Asn Ala Glu Phe Asn Asn Val Ala lie He Phe Asp 
260 265 270 



621 



864 



GTC A* i GAG G. i CTT AAC ACC GTG AAA GCA TTA CCA TTT ATA AAG AAC SI 2 

Vat He Glu Val Leu Asn Thr Val Lys Ala Leu Pro Phe He Lys Asn 

2S0 295 300 

GCC GAC AGT AAA CTA TAC AGA GAA TTA GTA ACT AGA ACA AAA OCT TTA 950 

Ala Asp Ser Lys Leu Tyr Arg Glu Leu Val Thr Arg Thr Lys Ala Leu 
305 210 * 215 320 

GAG ACT CTT AAA AAT CAA ATC AAA ACG ACT GAT TTG CCT CTT ATA GAT ".COS 

Glu Thr Lau Lys Asn Gin He Lvs Thr Thr As; Leu Pro Leu He Asp 

325 330 225 
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GAT ATA 
ASD lie 



CAA TTG 

Gin Leu 



TAG GCA 
Tyr Ala 
370 

TGG TCT 
Trp Ser 
385 

GTT CGT 
Val Arg 



AAC TCA 
Asn Ser 



AAT TTT 
Asn Phe 



AAT AAG 
Asn Lys 
450 

GCA GAC 
Ala Asp 

455 

GTT GCA 
Val Ala 



AAT CAA 
Asn Gin 



CTA CAC 
Leu His 



ATA AAT 
lie Asn 
530 

ACA CCA 
Thr Pro 
545 

AAT TTA 
Asn Leu 



GGA i i » 
Gly Phe 

GAT GCT 
Ass Ala 



CCC GAA ACT 
Pro Glu Thr 
3^0 

CCT ACA CCA 
Pre Thr Pro 
355 

GTA CAG TAT 
Val Gin Tyr 



GAA CCA TTT 
Glu Pro Phe 



GTT GAT CCG 
Val Asp Pro 
405 

GGA AAA CCT 
Gly Lys Pro 
420 

AAA GAT ATT 
Lys Asp lie 
435 

TTG AAA GCA 
Leu Lys Ala 



~G TCT CAA 
Leu Ser Gin 



ATA GGA AAT 
lie Gly Asn 
3S0 

GAA AGT AAG 
Glu Ser Lys 
375 

ACT GTC CAA 
Thr Val Gin 
390 

AAA AAG AGA 
Lys Lys Arg 



ATA GAA GCA 
He Glu Ala 



TAT CGA GGA 
Tyr Arg Glv 
485 

TCC ATT GAC 
Ser lie Asp 
500 

ATC GCA GCT 
lie Ala Ala 
515 

CAT HGA GCT 
His Gly Ala 



TTA CAT CTT 
Leu His Leu 



CTA GAA AGC 
Leu Glu Ser 
555 

ACA CCT TTG 
Thr Pro Leu 
530 

TTG CTA AAT 
Leu Leu Asn 



CAG TTT GCT 
Gin Phe Ala 



CAT CGT GAT 
His Arg Asp 

440 

GTG GAT GAA 
Val Aso Glu 
455 

AAA TTT GAC 
Lys Phe Asp 
470 

AAT AAC AAA 
Asn Asn Lys 



ATC GAG TTA 
lie Glu Leu 



GAA GCA GGT 
Glu Ala Gly 
520 

GAT GTG AAT 
Asp Val Asn 
535 

GCA ACA CGT 
Ala Thr Arg 
550 

CCA AAT ATT 
Pro Asn He 



CAT ACT GCA 
His Thr Ala 



CAT CCA GAC 
His Pre Asp 
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GTG AAC 
Val Asn 
345 

TGG GTT 
Trp Val 



GGC ATG 
Gly Met 



GGT AAC 
Gly Asn 



AAT AGA 
Asn Arg 
410 

GGA ACC 
Gly Thr 
425 

CTA TAG 
Leu Tyr 



GCT ACA 
Ala Thr 



AAT GAC 
Asn Asa 



ATA GCC 
II e Ala 
4S0 

AAA GAT 
Lys Asp 
505 

CAG GCA 
Gin Ala 



GCA AAA 
Ala Lys 



AGT GGA 
Ser Gly 



AAG GTA 
Lys Val 
570 

GTA ATG 
Val Me: 
585 

ATT GAT 
lie Asp 



Phe 



GAT 
Asp 



TAT 
Tyr 



GCT 
Ala 
395 

CTT 
Leu 



ATG 
Met 



GAT 
Asp 



ACT 
i hr 



CCG 
Pro 



GGC 
Gly 



TCG 
Ser 
330 

TGT 
Cys 
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A AT GAC GAA AAT 

Asn Asp Glu Asn 
3S0 

GTA GAA GTT AGG 

Val Glu Val Arg 
365 

AAA TTC AGT GAA 

Lys Phe Ser Glu 



AGA 
Arg 
475 

TTA 
Leu 



ATC 
lie 



ACT 
Thr 



GCA 
Ala 



TTG 
Leu 
460 

AGT 
Ser 



CCG ACT 
Pro Thr 



TTT AGG 
Phe Arg 



CAT TCA 
His Ser 
430 

GCC TTA 
Ala Leu 
445 

ATT GAA 
lie Glu 



ATA AAA 
He Lys 
400 

AAG TTC 
Lys Phe 
415 

CAA ACA 
Gin Thr 



AAT ATT 
Asn lie 



AAG GGT 
Lys Gly 



GCA ATG CAC GCA 
Ala Met His Ala 
430 

AGA TTT CTT TTG AAA 
Arg Phe Leu Leu Lys 
495 

AAC GGC TTT ACT CCT 
Asn Gly Phe Thr Pro 
510 

TTT GTT AAG TTA CTA 
Phe Val Lys Lsu Leu 
525 

ACA AGT AAG ACA AAT TTG 
Thr Ser Lys Thr Asn Leu 
54.0 



AAA 

Lys 



GGA 
Gly 



TCA AAA ACT GTA AGA 
Phe Ser Lys Thr Val Arg 
555 S60 

AAT GAA AAG GAG GAT GAC 
Asn Glu Lys Glu Asp Asa 
575 

AGT ACT TAT ATG GTT GTC . 
Ser Thr Tyr Met Val Val 
590 

AAA AAT GCG CAG TCT ACG 
Lys Asn Ala Gin Ser Thr 



1QS6 



1104 



1152 



1200 



1248 



1296 



1344 



1392 



1440 



1488 



1S36 



1584 



1632 



1680 



1728 



1 776 



1 82^ 
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595 60Q - 3* 



605 



IS 8S 12 S 15 SK Si 5ft SI ill SK «i SI Si ft - 

II - - - - 1; Si is s: s; Si i£ til SS aso '•«• 



640 



780 



i-I a? *** ATA GGA AGG AAA TCT ACA <*TA CTT TAC TTA TTA GAA 

Ser Ala Ala Lys He Gly Arg Lys Ser Thr Val Leu Tyr Leu Leu Glu 
735 790 795 800 

.AAA GGA GCT GAC ATT GGA GCT AAA ACA GCA GAC GGT TCT ACT GCC TTG 

Lys Gly Ala Asp He Gly Ala Lys Thr Ala Asp Gly Ser Thr Ala Leu 

805 810 815 

CAT TTA GCT. GTA TCT GGT CGT AAA ATG AAA ACT GTT GAA ACT CTA TTA 

Leu Ala Gly Arg Lys Met Lys Thr Val Glu Thr Leu Leu 

82 ° 825 830 

AAT AAA GGA GCA AAT TTA AAA GAA TAC GAT AAC AAT AAA TAT TTG CCA 

Asn Lys Gly Ala Asn Leu Lys Glu Tyr Asp Asn Asn Lys Tyr Leu Pro 

235 840 8^5 

ATA CAT AAA GCT ATT ATT AAT GAT GAC CTT GAC ATG GTA CGT TTG TTT 

-le His Lys Ala lie He Asn Asp Asp Leu Asp Met Val Arg Leu Phe 
8sG 855 860 



1963 



2016 



2064 



SIT Asn £?I Sir *f ^ T ^ CAT 777 GCA GCT TCA ATG GG ~ AGT ATT 
Val Asn His Met Ala Pro lie His Phe Ala Ala Ser Met Gly Ser lie 

645 650 655 

ftt mI- PIT ^ I AT CTC ATT TCC ATA AAA GTT AGT ATT AAT 

Lys Me. Leu Arg Tyr Leu He Ser lie Lys Asp Lys Val Ser lie Asn 
560 665 670 

Ser SI? ThI rtn ^ T ^ T MC TGG ACA CCT ™ CAT TTT GCT ATA TAT 
Ser Val Thr Glu Asn Asn Asn Trp Thr Pro Leu His Phe Ala lie Tyr 

680 gas 

TTT AAA AAA GAA GAT GCT GCA AAA GAA TTG TTG AAA CAA GAT GAP a-a 
Phe Lys Lys Glu Asp AT a Ala Lys Glu Leu Leu t£ tTn AsJ Asp m 2112 

69o 700 

SI Si £ til JIT ffi ■£ !S iK K *S SIT Si StI S SI »» 

710 715 720 

SIT 12 r? A rT A ^ T A T A ATT ^ GA A TTA TTG AAG AGA GGC 

Val Ser Thr Gly Gin He Asn He lie Lys Glu Leu Leu Lys Arg Gly 
725 730 735 

H°r ttl rT A GA A ^AA AAA ACT GGA GAA GGA TAT ACA TCT CTC CAC ATC 
Ser Asn He Glu Glu Lys Thr Gly Glu Gly Tyr Thr Ser Leu His I e 
/4 ° 745 750 

!fl A?» Si? ^ ^ G S* G CCA GAG ATA GCT GTT GTT T7 G ATT GAA AAC 
Ala Ala Met Arg Lys Glu Pro Glu lie Ala Val Val Leu He Glu Asn 
700 760 765 

Gil 2?I ?t£ rT A 2^ Q ? T CGA TCA GCT GAT AAT TTA ACA CCT TTA CAT 
Gly Ala Asp He Glu Ala Arg Ser Ala Asp Asn Leu Thr Pro Leu His 



2208 



22S6 



2304 



2352 



2400 



2448 



2496 



2544 



2592 
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C . , GAA AAA GAT QCC AGT CTC 



7 1 2, CCC AGT CTC AAA GAT GAT GAA ACA GAA GAG r<~ AGA 5c. 

Leu Glu Lys Asp p ro Ser Leu Lys A „ A „ ? Thr gTu G?* & A~ 

e ' u 575 880 

-kI I CA tT 7 ATG ~ A ATT GTT CAG AAA TTG CTT CTT GAA TTA Ta- aaC 2 c 3a 

• hr Ser n e Met Leu He Val Gin Lys Leu Leu Leu Glu Leu rtr Tsn 8 



885 890 

TAT TTT ATA AAT 



2880 



; M i 1' ' AAT AAT TAT G " GAA ACT TTG GAT GAA GAA GCT TTA T 27->« 

Tyr Phe 11. Asn Asn Tyr Ala Glu Thr Leu Aso Glu Glu Ali Leu Phe 
900 90S 9iq 

AAC CGC TTA GAT GAA CAA GGG AAA TTA GAG CTT GCA TAT ATC TTC CAT 2?a^ 
Asn Arg Leu Asp Glu Gin Gly Lys Leu Glu Leu Ala Tyr n e p fte 2734 
9T5 920 925 

AAT AAA GAA GGT GAT GCA AAA GAG GCT GTT AAG CCA ACT ATC CTT GTT 2*22 
Asn Lys Glu Gly Asp Ala Lys Glu Ala Val Lys Pro Thr He Leu Val 
930 935 9-10 

ACA ATT AAA CTT ATG GAA TAG TGC TTA AAA AAA CTT CGC GAA GAS TCT 
»hr He Lys Leu Met Glu Tyr Cys Leu Lys Lys Leu Arg Glu Glu Ser 

350 955 960 

5?* ff 7 GGT AGT 7TC SAT TCT CCA TCT TCA AAG CAA TGT ATT 2e->q 

Gly Ala Pra Glu Gly Ser Phe Asp Ser Pro Ser Ser Lys Gh Cys * e 
955 570 97s 

Zll IT 7 I" ^ G GAT ^ ATS T7T C3T CGT ACT TTA CC3 GAA TGA 2S7S 

Ser ihf Phe Ser Glu Asp Glu Met Phe Arg Arg Thr Leu Pro aft ■ 
9S0 9e5 990 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 992 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Asp Glu Glu Asp Gly Glu Met Thr Leu Glu Glu Arg Gln Ala Gln Cys 
1 5 10 15 

Lys Ala Ile Glu Tyr Ser Asn Ser Val Phe Gly Met Ile Ala Asp Val 
20 25 30 

Ala Asn Asp Ile Gly Ser Ile Pro Val Ile Gly Glu Val Val Gly Ile 
35 40 45 

Val Thr Ala Pro Ile Ala Ile Val Ser His Ile Thr Ser Ala Gly Leu 
50 55 60 

Asp Ile Ala Ser Thr Ala Leu Asp Cys Asp Asp Ile Pro Phe Asp Glu 
65 70 75 80 

Ile Lys Glu Ile Leu Glu Glu Arg Phe Asn Glu Ile Asp Arg Lys Leu 
85 90 95 

Asp Lys Asn Thr Ala Ala Leu Glu Glu Val Ser Lys Leu Val Ser Lys 
100 105 110 

Thr Phe Val Thr Val Glu Lys Thr Arg Asn Glu Met Asn Glu Asn Phe 
115 120 125 

Lys Leu Val Leu Glu Thr Ile Glu Ser Lys Glu Ile Lys Ser Ile Val 
130 135 140 

Phe Lys Ile Asn Asp Phe Lys Lys Phe Phe Glu Lys Glu Arg Gln Arg 
145 150 155 160 

Ile Lys Gly Leu Pro Lys Asp Arg Tyr Val Ala Lys Leu Leu Glu Gln 
165 170 175 

Lys Gly Ile Leu Gly Ser Leu Lys Glu Val Arg Glu Pro Ser Gly Asn 
180 185 190 

Ser Leu Ser Ser Ala Leu Asn Glu Leu Leu Asp Lys Asn Asn Asn Tyr 
195 200 205 

Ala Ile Pro Lys Val Val Asp Asp Asn Lys Ala Phe Gln Ala Leu Tyr 
210 215 220 

Ala Leu Phe Tyr Gly Thr Gln Thr Tyr Ala Ala Val Met Phe Phe Leu 
225 230 235 240 

Leu Glu Gln His Ser Tyr Leu Ala Asp Tyr Tyr Tyr Gln Lys Gly Asp 
245 250 255 

Asp Val Asn Phe Asn Ala Glu Phe Asn Asn Val Ala Ile Ile Phe Asp 
260 265 270 

Asp Phe Lys Ser Ser Leu Thr Gly Gly Asp Asp Gly Leu Ile Asp Asn 
275 280 285 

Val Ile Glu Val Leu Asn Thr Val Lys Ala Leu Pro Phe Ile Lys Asn 
290 295 300 

Ala Asp Ser Lys Leu Tyr Arg Glu Leu Val Thr Arg Thr Lys Ala Leu 
305 310 315 320 

Glu Thr Leu Lys Asn Gln Ile Lys Thr Thr Asp Leu Pro Leu Ile Asp 
325 330 335 

Asp Ile Pro Glu Thr Leu Ser Gln Val Asn Phe Pro Asn Asp Glu Asn 
340 345 350 

Gln Leu Pro Thr Pro Ile Gly Asn Trp Val Asp Gly Val Glu Val Arg 
355 360 365 

Tyr Ala Val Gln Tyr Glu Ser Lys Gly Met Tyr Ser Lys Phe Ser Glu 
370 375 380 

Trp Ser Glu Pro Phe Thr Val Gln Gly Asn Ala Cys Pro Thr Ile Lys 
385 390 395 400 

Val Arg Val Asp Pro Lys Lys Arg Asn Arg Leu Ile Phe Arg Lys Phe 
405 410 415 

Asn Ser Gly Lys Pro Gln Phe Ala Gly Thr Met Thr His Ser Gln Thr 
420 425 430 

Asn Phe Lys Asp Ile His Arg Asp Leu Tyr Asp Ala Ala Leu Asn Ile 
435 440 445 

Asn Lys Leu Lys Ala Val Asp Glu Ala Thr Thr Leu Ile Glu Lys Gly 
450 455 460 

Ala Asp Ile Glu Ala Lys Phe Asp Asn Asp Arg Ser Ala Met His Ala 
465 470 475 480 

Val Ala Tyr Arg Gly Asn Asn Lys Ile Ala Leu Arg Phe Leu Leu Lys 
485 490 495 

Asn Gln Ser Ile Asp Ile Glu Leu Lys Asp Lys Asn Gly Phe Thr Pro 
500 505 510 

Leu His Ile Ala Ala Glu Ala Gly Gln Ala Gly Phe Val Lys Leu Leu 
515 520 525 

Ile Asn His Gly Ala Asp Val Asn Ala Lys Thr Ser Lys Thr Asn Leu 
530 535 540 

Thr Pro Leu His Leu Ala Thr Arg Ser Gly Phe Ser Lys Thr Val Arg 
545 550 555 560 

Asn Leu Leu Glu Ser Pro Asn Ile Lys Val Asn Glu Lys Glu Asp Asp 
565 570 575 

Gly Phe Thr Pro Leu His Thr Ala Val Met Ser Thr Tyr Met Val Val 
580 585 590 

Asp Ala Leu Leu Asn His Pro Asp Ile Asp Lys Asn Ala Gln Ser Thr 
595 600 605 

Ser Gly Leu Thr Pro Phe His Leu Ala Ile Ile Asn Glu Ser Gln Glu 
610 615 620 

Val Ala Glu Ser Leu Val Glu Ser Asn Ala Asp Leu Asn Ile Gln Asp 
625 630 635 640 

Val Asn His Met Ala Pro Ile His Phe Ala Ala Ser Met Gly Ser Ile 
645 650 655 

Lys Met Leu Arg Tyr Leu Ile Ser Ile Lys Asp Lys Val Ser Ile Asn 
660 665 670 

Ser Val Thr Glu Asn Asn Asn Trp Thr Pro Leu His Phe Ala Ile Tyr 
675 680 685 

Phe Lys Lys Glu Asp Ala Ala Lys Glu Leu Leu Lys Gln Asp Asp Ile 
690 695 700 

Asn Leu Thr Ile Val Ala Asp Gly Asn Leu Thr Val Leu His Leu Ala 
705 710 715 720 

Val Ser Thr Gly Gln Ile Asn Ile Ile Lys Glu Leu Leu Lys Arg Gly 
725 730 735 

Ser Asn Ile Glu Glu Lys Thr Gly Glu Gly Tyr Thr Ser Leu His Ile 
740 745 750 

Ala Ala Met Arg Lys Glu Pro Glu Ile Ala Val Val Leu Ile Glu Asn 
755 760 765 

Gly Ala Asp Ile Glu Ala Arg Ser Ala Asp Asn Leu Thr Pro Leu His 
770 775 780 

Ser Ala Ala Lys Ile Gly Arg Lys Ser Thr Val Leu Tyr Leu Leu Glu 
785 790 795 800 

Lys Gly Ala Asp Ile Gly Ala Lys Thr Ala Asp Gly Ser Thr Ala Leu 
805 810 815 

His Leu Ala Val Ser Gly Arg Lys Met Lys Thr Val Glu Thr Leu Leu 
820 825 830 

Asn Lys Gly Ala Asn Leu Lys Glu Tyr Asp Asn Asn Lys Tyr Leu Pro 
835 840 845 

Ile His Lys Ala Ile Ile Asn Asp Asp Leu Asp Met Val Arg Leu Phe 
850 855 860 

Leu Glu Lys Asp Pro Ser Leu Lys Asp Asp Glu Thr Glu Glu Gly Arg 
865 870 875 880 

Thr Ser Ile Met Leu Ile Val Gln Lys Leu Leu Leu Glu Leu Tyr Asn 
885 890 895 

Tyr Phe Ile Asn Asn Tyr Ala Glu Thr Leu Asp Glu Glu Ala Leu Phe 
900 905 910 

Asn Arg Leu Asp Glu Gln Gly Lys Leu Glu Leu Ala Tyr Ile Phe His 
915 920 925 

Asn Lys Glu Gly Asp Ala Lys Glu Ala Val Lys Pro Thr Ile Leu Val 
930 935 940 

Thr Ile Lys Leu Met Glu Tyr Cys Leu Lys Lys Leu Arg Glu Glu Ser 
945 950 955 960 

Gly Ala Pro Glu Gly Ser Phe Asp Ser Pro Ser Ser Lys Gln Cys Ile 
965 970 975 

Ser Thr Phe Ser Glu Asp Glu Met Phe Arg Arg Thr Leu Pro Glu * 
980 985 990 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 3706 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PLASMID DNA" 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: LATRODECTUS MACTANS TREDECIMGUTTATUS 

(vii) IMMEDIATE SOURCE: 
(B) CLONE: pT7.deltarL 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 45..3686 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

GGTCAATTGA AACTTTATGA TAGGATTCAC TTTCTTA7AT AGAA ATG CAT TCC AAA 

Met His Ser Lys 
995 

GAA TTA CAA ACT ATT TCA GCA GCG GTA GCA CGA AAA GCA GTA CCC AAT 
Glu Leu Gin Thr He Ser Ala Ala Val Ala Arg Lys Ala Val Pro Asn 
1000 IOCS 1010 

ACT ATG GTT ATT CGG TTG AAA AGA GAT GAA GAA GAT GGA GAA AT3 ACT 
Thr Met Val lie Arg Leu Lys Arg As? Glu Glu Asp Gly Glu Met Thr 
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1015 1020 1Q25 



CTA GAA GAA AGA CAA GCA CAA TGC AAA GCA ATA GAG TAC AGP AAT TCA 

Leu 6 u 61u Arg Gin Ala Gin Cys Lys Ala lie Glu Tyr $? r Ser 
1030 1035 1040 

GTT TTT GGG ATG ATC GCT GAT GTA GCT AAC GAC ATC GGT TCC ATT r 

Val Phe Gly Met He Ala Asp Val Ala Asn Asp He Gly Ser ill P 

1CU5 1050 1055 !, 

GTA ATT GGC GAA GTA GTT GGC ATT GTA ACT GCC CCA ATT GCC ATC GTA 

Val He Gly Glu Val Val Gly lie Val Thr Ala Pro lie Ala lie Val 



CCT 248 

Pro 

1060 



1065 1070 



1075 



AGT CAC ATT ACT AGC GCA GGC TTG GAT ATA GCT TCT ACG GCA TTA GAT iaa 

Ser His He Thr Ser Ala Gly Leu Asp lie Ala Ser Thr Ala Leu Asa 
1080 1085 1Q90 P 

TGT GAT GAT ATA CCT TTT GAT GAG ATT AAG GAA ATA TTA GAA GAA AGA 

Cys Asp Asp lie Pro Phe Asp Glu He Lys Glu He Leu Glu Glu Ara 
1095 1100 HQS 



392 



440 



i t C AAT GAA ATA GAT AGA AAG TTG GAC AAG AAC ACA GCT GCT TTG GAA 

Phe Asn Glu lie Asp Arg Lys Leu Asp Lys Asn Thr Ala Ala Leu Glu 
mo 111S 1120 

GAG GTC TCT AAA CTG GTA AGT AAA ACT TTT GTT ACG GTG GAA AAA ACA 48 fi 

Glu Val Ser Lys Leu Val Ser Lys Thr Phe Val Thr Val Glu Lys Thr 

1125 1130 1135 1140 

AGG AAT GAA ATG AAC GAA AAT TTT AAG CTT GTT TTG GAA ACT ATA GAA 536 

Arg Asn Glu Met Asn Glu Asn Phe Lys Leu Val Leu Glu Thr lie Glu " 
1145 nso ii55 

AGC AAA GAA ATA AAA TCA ATT GTA TTC AAA ATA AAT GAT TTT AAA AAG 584 

Ser Lys Glu He Lys Ser He Val Phe Lys He Asn Asp Phe Lys Lys 
1160 1155 1170 

TTT TTT GAA AAA GAA CGA CAA AGA ATT AAA GGT TTG CCT AAA GAT AGG 632 

Phe Phe Glu Lys Glu Arg Gin Arg He Lys Gly Leu Pro Lys Asp Arg 
1175 1180 1185 

TAT GTT GCT AAG CTT CTA GAA CAA AAA GGT ATT TTA GGT TCT TTA AAA 680 

Tyr Val Ala Lys Leu Leu Glu Gin Lys Gly He Leu Gly Ser Leu Lys 
1190 1195 1200 

GAA GTA AGA GAA CCA TCT GGA AAC AGT CTG AGC TCC GCG TTA AAT GAA 728 

Glu Val Arg Glu Pro Ser Gly Asn Ser Leu Ser Ser Ala Leu Asn Glu 

1205 1210 1215 1220 

CTC TTA GAC AAA AAC AAC AAC TAT GCC ATC CCA AAA GTG GTT GAT GAT 776 

Leu Leu Asp Lys Asn Asn Asn Tyr Ala He Pro Lys Val Val Asp Asp 
1225 1230 1235 

AAT AAG GCC TTT CAG GCG CTG TAT GCT TTA TTT TAT GGA ACT CAG ACT 82 4 

Asn Lys Ala Phe Gin Ala Leu Tyr Ala Leu Phe Tyr Gly Thr Gin Thr 
1240 1245 1250 

TAT GCA GCC GTT ATG TTT TTC TTA CTC GAA CAA CAT TCT TAT CTG GCT 872 

Tyr Ala Ala Val Met Phe Phe Leu Leu Glu Gin His Ser Tyr Leu Ala 
1255 1250 1265 

GAT TAT TAT TAC CAA AAA GGT GAT GAT GTA AAT TTT AAT GCA GAA TTT S20 

Asp Tyr Tyr Tyr Gin Lys Gly Asp Asp Val Asn Phe Asn Ala Glu Phe 
1270 1275 1280 






968 



1016 



1064 



T 1 1 2 



1160 



AAT AAT GTA GCA ATT ATT TTT GAT GAC TTT AAA TCA TCA r— aPa PGA 
^285 AU ^I 0 Phe A » Ph * ^S 8 S,P L ™ ^ So 

GGA GAT GAC GGA TTA ATA GAT AAT GTC ATT GAG GTT CTT AAC ACC GTG 
Gly Asp Asp Gly Leu lie Asp Asn Val He Glu Val Leu Tsn Thr Val 
1305 1310 1315 

AAA GCA TTA CCA TTT ATA AAG AAC GCC GAC AGT AAA CTA TAC AGA GAA 
Lys Ala Leu Pro Phe lie Lys Asn Ala • tp Ser Lys Leu Tyr i% §Tu 
1320 1325 T330 

TTA GTA ACT AGA ACA AAA GCT TTA GAG ACT CTT AAA AAT CAA ATC AAA 
Leu Val Thr Arg Thr Lys Ala Leu G1 u Thr Leu Lys Asn Gin He Lys 
1335 1340 1345 

ACQ ACT GAT TTG CCT CTT ATA GAT GAT ATA CCC GAA ACT TTG TCT CAA 
Thr Thr Asp Leu Pro Leu lie Asp Asp lie Pro Glu Thr Ul llr fift 
»«Q 1355 1360 

GTG AAC TTT CCG AAT GAC GAA AAT CAA ""G CCT ACA CCA ATA GGA AAT l?nfl 
Val Asn Phe Pro Asn Asp Glu Asn Gin Leu Pro Thr Pro lie Gly Jil ° 8 
1365 ^370 1373 ' 1380 

TGG GTT GAT GGC GTA GAA GTT AGG TAC GCA GTA CAG TAT GAA AGT AAG i 
Trp Val Asp Gly Val Glu Val Arg Tyr Ala Val Gin Tyr Glu Ser Lys 
1385 1390 1395 

GGC ATG TAT TCG AAA TTC AGT GAA TGG TCT GAA CCA TTT ACT GTC CAA 1304 
Gly Met Tyr Ser Lys Phe Ser Glu Trp Ser Glu Pro Phe Thr Val Gin 
1400 1405 1410 

GGT AAC GCT TGT CCG ACT ATA AAA GTT CGT GTT GAT CCG AAA AAG AGA 1352 
Gly Asn Ala Cys Pro Thr lie Lys Val Arg Val Asp Pro Lys Lys Arc 
1415 1420 1425 

AAT AGA CTT ATC TTT AGG AAG TTC AAC TCA GGA AAA CCT CAG TTT GCT 1400 
Asn Arg Leu lie Phe Arg Lys Phe Asn Ser Gly Lys Pro Gin Phe Ala 
1430 1435 1440 

GGA ACC ATG ACT CAT TCA CAA ACA AAT TTT AAA GAT ATT CAT CGT GAT 1443 
Gly Thr Met Thr His Ser Gin Thr Asn Phe Lys Asp lie His Arg Asp 
m$ 1450 1455 1460 

CTA TAC GAT GCA GCC TTA AAT ATT AAT AAG TTG AAA GCA GTG GAT GAA 149 6 

Leu Tyr Asp Ala Ala Leu Asn lie Asn Lys Leu Lys Ala Val Asp Glu 
1465 1470 1475 

GCT ACA ACT TTG ATT GAA AAG GGT GCA GAC ATA GAA GCA AAA TTT GAC 1544 
Ala Thr Thr Leu lie Glu Lys Gly Ala Asp lie Glu Ala Lys Phe Asp 
1480 1465 1490 

AAT GAC AGA AGT GCA ATG CAC GCA GTT GCA TAT CGA GGA AAT AAC AAA 1592 
Asn Asp Arg Ser Ala Met His Ala Val Ala Tyr Arg Gly Asn Asn Lys 
1495 1500 1505 

ATA GCC TTA AGA TTT CTT TTG AAA AAT CAA TCC ATT GAC ATC GAG TTA 1640 
lie Ala Leu Arg Phe Leu Leu Lys Asn Gin Ser lie Asp He Glu Leu 
1510 1515 1520 

AAA GAT AAA AAC GGC TTT ACT CCT CTA CAC ATC GCA GCT GAA GCA GGT 1688 
Lys Asp Lys Asn Gly Phe Thr Pro Leu His lie Ala Ala Glu Ala Gly 
1525 1530 1 525 1540 

CAG GCA GGA TTT GTT AAG TTA CTA ATA AAT CAT GGA GCT GAT GTG AAT 1736 
Gin Ala Gly Phe Val Lys Leu Leu lie Asn His Gly Ala Asp Val Asn 
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1545 



- u 1 



1550 



1555 



Alt lCs ~nr t GT f* 3 ^ CA AAT ~ G ACA CCA ~ A CA7 CTT GCA 'CA r GT 1784
Ala Lys Thr Ser Lys Thr Asn Leu Thr Pro Leu His Leu Ala Thr Arg
1560 1565 1570

AGT GGA , i , TCA AAA ACT GTA AGA AAT TTA CTA r: * * _ 1832
Ser Gly Phe Ser Lys Thr Val Arg Asn Leu Leu Glu Ser Pro Asn Ile
1575 1580 1585

f* 2 v T ? ? AT SK^ AAG GAG GAT GAC S£;; TTT ACA CCT 7TG r-T a— rr- 1880
Lys Val Asn Glu Lys Glu Asp Asp Gly Phe Thr Pro Leu His Thr Ala
1590 1595 1600

GTA ATG AGT ACT TAT ATG GTT GTC GAT GCT TTG CTA AAT CAT rra r.r 1928
Val Met Ser Thr Tyr Met Val Val Asp Ala Leu Leu Asn His Pro Asp
1605 1610 1615 1620

ATT GAT 7** AAT GCS CAG TCT ACS TCA GGA TTG ACT CCT TTC CAT ~A 1976
Ile Asp Lys Asn Ala Gln Ser Thr Ser Gly Leu Thr Pro Phe His Leu
1625 1630 1635

GCA Cj 7 Cj 7 AAT GAA AGT CAA GAA GTT GCA GAA TCT TTA GTG GAA a«— 2024
Ala Ile Ile Asn Glu Ser Gln Glu Val Ala Glu Ser Leu Val Glu Ser
1640 1645 1650

AAT GCT GAT CTA AAT ATT CAG GAT GTT AAC CAT ATG GCT CCT ATT CAT 2072
Asn Ala Asp Leu Asn Ile Gln Asp Val Asn His Met Ala Pro Ile His
1655 1660 1665

TTT ??? GCT TCA ATG GGT AGT ATT AAA ATG CTT AGA TAT CTC ATT TCC 2120
Phe Ala Ala Ser Met Gly Ser Ile Lys Met Leu Arg Tyr Leu Ile Ser
1670 1675 1680

ATA AAA GAT AAA GTT AGT ATT AAT TCT GTG ACT GAG AAT AAT AA~ TGG 2168
Ile Lys Asp Lys Val Ser Ile Asn Ser Val Thr Glu Asn Asn Asn Trp
1685 1690 1695 1700

t*Z2 pEI 7 TA ?T T i 77 GCT ATA TAT TTT AAA AAA GAA GAT GCT GCA AAA 2216
Thr Pro Leu His Phe Ala Ile Tyr Phe Lys Lys Glu Asp Ala Ala Lys
1705 1710 1715

GAA TTG TTG AAA CAA GAT GAC ATA AAT TTA ACA ATT GTT GCA GAT GGT 2264
Glu Leu Leu Lys Gln Asp Asp Ile Asn Leu Thr Ile Val Ala Asp Gly
1720 1725 1730

AAT CTT ACC GTT TTA CAT CTT GCT GTT TCG ACA GGA CAA ATA AAT ATA 2312
Asn Leu Thr Val Leu His Leu Ala Val Ser Thr Gly Gln Ile Asn Ile
1735 1740 1745

tTI ^tt T TA T 70 AAG AGA GGC TCC ^ ATA GAA GAA AAA ACT GGA 2360
Ile Lys Glu Leu Leu Lys Arg Gly Ser Asn Ile Glu Glu Lys Thr Gly
1750 1755 1760

5?* TAT ACA TCT CTC CAC ATC GCT GCG ATG CGA AAG GAG CCA GAG 2408
Glu Gly Tyr Thr Ser Leu His Ile Ala Ala Met Arg Lys Glu Pro Glu
1765 1770 1775 1780

-T A ?t T GTT GTT 773 ATT GAA AAC GGT GCT GAC ATA GAA GCT CGA TCA 2456
Ile Ala Val Val Leu Ile Glu Asn Gly Ala Asp Ile Glu Ala Arg Ser
1785 1790 1795

GCT GAT AAT TTA ACA CCT TTA CAT TCT GCC GCA AAA ATA GGA AGG AAA 2504
Ala Asp Asn Leu Thr Pro Leu His Ser Ala Ala Lys Ile Gly Arg Lys
1800 1805 1810

III t£p Sit ?7u JZ A ^ Sf A ^ A GGA GCT GAC ATT GGA GCT AAA 2552
Ser Thr Val Leu Tyr Leu Leu Glu Lys Gly Ala Asp Ile Gly Ala Lys
1815 1820 1825

xS A ? AC ^ T TCT ACT GCC 77(3 CAT TTA GCT GTA TCT GGT CGT AAA 2600
Thr Ala Asp Gly Ser Thr Ala Leu His Leu Ala Val Ser Gly Arg Lys
1830 1835 1840

ATG AAA ACT GTT GAA ACT CTA TTA AAT AAA GGA GCA AAT TTA AAA GAA 2648
Met Lys Thr Val Glu Thr Leu Leu Asn Lys Gly Ala Asn Leu Lys Glu
1845 1850 1855 1860

rt~ a AT ^ C AAT ^ TAT TTG CCA ATA CAT AAA GCT ATT ATT AAT GAT 2696
Tyr Asp Asn Asn Lys Tyr Leu Pro Ile His Lys Ala Ile Ile Asn Asp
1865 1870 1875

GAC CTT GAC ATG GTA CGT TTG TTT CTT GAA AAA GAT CCC AGT CTC AAA 2744
Asp Leu Asp Met Val Arg Leu Phe Leu Glu Lys Asp Pro Ser Leu Lys
1880 1885 1890

GAT GAT GAA ACA GAA GAG GGT AGA ACT TCA ATT ATG TTA ATT GTT CAG 2792
Asp Asp Glu Thr Glu Glu Gly Arg Thr Ser Ile Met Leu Ile Val Gln
1895 1900 1905

AAA TTG CTT CTT GAA TTA TAT AAC TAT TTT ATA AAT AAT TAT GCT GAA 2840
Lys Leu Leu Leu Glu Leu Tyr Asn Tyr Phe Ile Asn Asn Tyr Ala Glu
1910 1915 1920

thZ T 73 ? AT ^ 5^ GCT ~ A ~ C AAC CGC ~ A GAT GAA CAA GGG AAA 2888
Thr Leu Asp Glu Glu Ala Leu Phe Asn Arg Leu Asp Glu Gln Gly Lys
1925 1930 1935 1940

TTA GAG CTT GCA TAT ATC TTC CAT AAT AAA GAA GGT GAT GCA AAA GAG 2936
Leu Glu Leu Ala Tyr Ile Phe His Asn Lys Glu Gly Asp Ala Lys Glu
1945 1950 1955

f?I w 7 ! f* G E CA ACT ATC CTT GTT ACA ATT AAA CTT ATG GAA TAC TGC 2984
Ala Val Lys Pro Thr Ile Leu Val Thr Ile Lys Leu Met Glu Tyr Cys
1960 1965 1970

TTA AAA AAA CTT CGC GAA GAG TCT GGA GCT CCT GAA GGT AGT TTC GAT 3032
Leu Lys Lys Leu Arg Glu Glu Ser Gly Ala Pro Glu Gly Ser Phe Asp
1975 1980 1985

TCT CCA TCT TCA AAG CAA TGT ATT TCT ACC TTT TCA GAG GAT GAA ATG 3080
Ser Pro Ser Ser Lys Gln Cys Ile Ser Thr Phe Ser Glu Asp Glu Met
1990 1995 2000

TTT CGT CGT ACT TTA CCG GAA ATT GTA AAA GAA ACG AAC AGC AGA TAT 3128
Phe Arg Arg Thr Leu Pro Glu Ile Val Lys Glu Thr Asn Ser Arg Tyr
2005 2010 2015 2020

TTA CCA CTA AAG GGC TTT TCT CGC AGC CTA AAT AAG TTT CTC CCT TCT 3176
Leu Pro Leu Lys Gly Phe Ser Arg Ser Leu Asn Lys Phe Leu Pro Ser
2025 2030 2035

CTA AAA TTT GCC GAA AGT AAG AAT AGC TAC AGA TCT GAA AAT TTT GTT 3224
Leu Lys Phe Ala Glu Ser Lys Asn Ser Tyr Arg Ser Glu Asn Phe Val
2040 2045 2050

AGC AAT ATT GAT TCC AAC GGA GCA TTA CTT TTA CTC GAT GTA TTT ATC 3272
Ser Asn Ile Asp Ser Asn Gly Ala Leu Leu Leu Leu Asp Val Phe Ile
2055 2060 2065

AGA AAG TTT ACT AAT GAG AAA TAC AAT TTG ACT GGA AAA GAA GCT GTA 3320
Arg Lys Phe Thr Asn Glu Lys Tyr Asn Leu Thr Gly Lys Glu Ala Val
2070 2075 2080

CCC TAT CTG GAA GCA AAG GCT TCA TCA TTA CGT ATC GCT TCT AAA TTT 3368
Pro Tyr Leu Glu Ala Lys Ala Ser Ser Leu Arg Ile Ala Ser Lys Phe
2085 2090 2095 2100

GAA GAA CTT CTA ACT GAA GTT AAA GGT ATT CCG GCT GGA GAG CTA ATT 3416
Glu Glu Leu Leu Thr Glu Val Lys Gly Ile Pro Ala Gly Glu Leu Ile
2105 2110 2115

AAT ATG GCC GAA GTG AGT TCC AAC ATA CAT AAG GCA ATT GCA AGT GGT 3464
Asn Met Ala Glu Val Ser Ser Asn Ile His Lys Ala Ile Ala Ser Gly
2120 2125 2130

AAG CCT GTA TCA AAA GTC TTA TGT TCG TAT TTG GAT ACC TTT TCT GAA 3512
Lys Pro Val Ser Lys Val Leu Cys Ser Tyr Leu Asp Thr Phe Ser Glu
2135 2140 2145

TTA AAT TCT CAA CAA ATG GAA GAA TTA GTT AAC ACA TAC TTA TCC ACC 3560
Leu Asn Ser Gln Gln Met Glu Glu Leu Val Asn Thr Tyr Leu Ser Thr
2150 2155 2160

AAA CCT TCT GTA ATT ACG TCA GCA TCT GCA GAT TAC CAG AAA CTT CCT 3608
Lys Pro Ser Val Ile Thr Ser Ala Ser Ala Asp Tyr Gln Lys Leu Pro
2165 2170 2175 2180

AAT TTG TTA ACT GCA ACT TGC TTA GAA CCA GAA AGA ATG GCT CAA CTT 3656
Asn Leu Leu Thr Ala Thr Cys Leu Glu Pro Glu Arg Met Ala Gln Leu
2185 2190 2195

ATA GAT GTG CAT CAA AAG ATG TTT TTA CGT TAAAATACCA TTCCTTCTGT 3706
Ile Asp Val His Gln Lys Met Phe Leu Arg
2200 2205


(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1214 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met His Ser Lys Glu Leu Gln Thr Ile Ser Ala Ala Val Ala Arg Lys
1 5 10 15

Ala Val Pro Asn Thr Met Val Ile Arg Leu Lys Arg Asp Glu Glu Asp
20 25 30

Gly Glu Met Thr Leu Glu Glu Arg Gln Ala Gln Cys Lys Ala Ile Glu
35 40 45

Tyr Ser Asn Ser Val Phe Gly Met Ile Ala Asp Val Ala Asn Asp Ile
50 55 60

Gly Ser Ile Pro Val Ile Gly Glu Val Val Gly Ile Val Thr Ala Pro
65 70 75 80

Ile Ala Ile Val Ser His Ile Thr Ser Ala Gly Leu Asp Ile Ala Ser
85 90 95

Thr Ala Leu Asp Cys Asp Asp Ile Pro Phe Asp Glu Ile Lys Glu Ile
100 105 110

Leu Glu Glu Arg Phe Asn Glu Ile Asp Arg Lys Leu Asp Lys Asn Thr
115 120 125

Ala Ala Leu Glu Glu Val Ser Lys Leu Val Ser Lys Thr Phe Val Thr
130 135 140

Val Glu Lys Thr Arg Asn Glu Met Asn Glu Asn Phe Lys Leu Val Leu
145 150 155 160

Glu Thr Ile Glu Ser Lys Glu Ile Lys Ser Ile Val Phe Lys Ile Asn
165 170 175

Asp Phe Lys Lys Phe Phe Glu Lys Glu Arg Gln Arg Ile Lys Gly Leu
180 185 190

Pro Lys Asp Arg Tyr Val Ala Lys Leu Leu Glu Gln Lys Gly Ile Leu
195 200 205

210 215 LSU SeP Ser 

Ala Leu Asn Glu Leu Leu Asp Lys Asn Asn Asn Tyr Ala Ile Pro Lys
225 230 235 240

Val Val Asp Asp Asn Lys Ala Phe Gln Ala Leu Tyr Ala Leu Phe Tyr
245 250 255

Gly Thr Gln Thr Tyr Ala Ala Val Met Phe Phe Leu Leu Glu Gln His
260 265 270

Ser Tyr Leu Ala Asp Tyr Tyr Tyr Gln Lys Gly Asp Asp Val Asn Phe
275 280 285

Asn Ala Glu Phe Asn Asn Val Ala Ile Ile Phe Asp Asc Phe Lys Ser
290 295 300

Ser Leu Thr Gly Gly Asp Asp Gly Leu Ile Asp Asn Val Ile Glu Val
305 310 315 320

Leu Asn Thr Val Lys Ala Leu Pro Phe Ile Lys Asn Ala Asp Ser Lys
325 330 335

Leu Tyr Arg Glu Leu Val Thr Arg Thr Lys Ala Leu Glu Thr Leu Lys
340 345 350

Asn Gln Ile Lys Thr Thr Asp Leu Pro Leu Ile Asp Asp Ile Pro Glu
355 360 365

Thr Leu Ser Gln Val Asn Phe Pro Asn Asp Glu Asn Gln Leu Pro Thr
370 375 380

Pro Ile Gly Asn Trp Val Asp Gly Val Glu Val Arg Tyr Ala Val Gln
385 390 395 400

Tyr Glu Ser Lys Gly Met Tyr Ser Lys Phe Ser Glu Trp Ser Glu Pro
405 410 415

Phe Thr Val Gln Gly Asn Ala Cys Pro Thr Ile Lys Val Arg Val Asp
420 425 430

Pro Lys Lys Arg Asn Arg Leu Ile Phe Arg Lys Phe Asn Ser Gly Lys
435 440 445

Pro Gln Phe Ala Gly Thr Met Thr His Ser Gln Thr Asn Phe Lys Asp
450 455 460

Ile His Arg Asp Leu Tyr Asp Ala Ala Leu Asn Ile Asn Lys Leu Lys
465 470 475 480

Ala Val Asp Glu Ala Thr Thr Leu Ile Glu Lys Gly Ala Asp Ile Glu
485 490 495

Ala Lys Phe Asp Asn Asp Arg Ser Ala Met His Ala Val Ala Tyr Arg
500 505 510

Gly Asn Asn Lys Ile Ala Leu Arg Phe Leu Leu Lys Asn Gln Ser Ile
515 520 525

Asp Ile Glu Leu Lys Asp Lys Asn Gly Phe Thr Pro Leu His Ile Ala
530 535 540

Ala Glu Ala Gly Gln Ala Gly Phe Val Lys Leu Leu Ile Asn His Gly
545 550 555 560

Ala Asp Val Asn Ala Lys Thr Ser Lys Thr Asn Leu Thr Pro Leu His
565 570 575

Leu Ala Thr Arg Ser Gly Phe Ser Lys Thr Val Arg Asn Leu Leu Glu
580 585 590

Ser Pro Asn Ile Lys Val Asn Glu Lys Glu Asp Asp Gly Phe Thr Pro
595 600 605

Leu His Thr Ala Val Met Ser Thr Tyr Met Val Val Asp Ala Leu Leu
610 615 620

Asn His Pro Asp Ile Asp Lys Asn Ala Gln Ser Thr Ser Gly Leu Thr
625 630 635 640

Pro Phe His Leu Ala Ile Ile Asn Glu Ser Gln Glu Val Ala Glu Ser
645 650 655

Leu Val Glu Ser Asn Ala Asp Leu Asn Ile Gln Asp Val Asn His Met
660 665 670

Ala Pro Ile His Phe Ala Ala Ser Met Gly Ser Ile Lys Met Leu Arg
675 680 685

Tyr Leu Ile Ser Ile Lys Asp Lys Val Ser Ile Asn Ser Val Thr Glu
690 695 700

Asn Asn Asn Trp Thr Pro Leu His Phe Ala Ile Tyr Phe Lys Lys Glu
705 710 715 720

Asp Ala Ala Lys Glu Leu Leu Lys Gln Asp Asp Ile Asn Leu Thr Ile
725 730 735

Val Ala Asp Gly Asn Leu Thr Val Leu His Leu Ala Val Ser Thr Gly
740 745 750

Gln Ile Asn Ile Ile Lys Glu Leu Leu Lys Arg Gly Ser Asn Ile Glu
755 760 765

Glu Lys Thr Gly Glu Gly Tyr Thr Ser Leu His Ile Ala Ala Met Arg
770 775 780

Lys Glu Pro Glu Ile Ala Val Val Leu Ile Glu Asn Gly Ala Asp Ile
785 790 795 800

Glu Ala Arg Ser Ala Asp Asn Leu Thr Pro Leu His Ser Ala Ala Lys
805 810 815

Ile Gly Arg Lys Ser Thr Val Leu Tyr Leu Leu Glu Lys Gly Ala Asp
820 825 830

Ile Gly Ala Lys Thr Ala Asp Gly Ser Thr Ala Leu His Leu Ala Val
835 840 845

Ser Gly Arg Lys Met Lys Thr Val Glu Thr Leu Leu Asn Lys Gly Ala
850 855 860

Asn Leu Lys Glu Tyr Asp Asn Asn Lys Tyr Leu Pro Ile His Lys Ala
865 870 875 880

Ile Ile Asn Asp Asp Leu Asp Met Val Arg Leu Phe Leu Glu Lys Asp
885 890 895

Pro Ser Leu Lys Asp Asp Glu Thr Glu Glu Gly Arg Thr Ser Ile Met
900 905 910

Leu Ile Val Gln Lys Leu Leu Leu Glu Leu Tyr Asn Tyr Phe Ile Asn
915 920 925

Asn Tyr Ala Glu Thr Leu Asp Glu Glu Ala Leu Phe Asn Arg Leu Asp
930 935 940

Glu Gln Gly Lys Leu Glu Leu Ala Tyr Ile Phe His Asn Lys Glu Gly
945 950 955 960

Asp Ala Lys Glu Ala Val Lys Pro Thr Ile Leu Val Thr Ile Lys Leu
965 970 975

Met Glu Tyr Cys Leu Lys Lys Leu Arg Glu Glu Ser Gly Ala Pro Glu
980 985 990

Gly Ser Phe Asp Ser Pro Ser Ser Lys Gln Cys Ile Ser Thr Phe Ser
995 1000 1005

Glu Asp Glu Met Phe Arg Arg Thr Leu Pro Glu Ile Val Lys Glu Thr
1010 1015 1020

Asn Ser Arg Tyr Leu Pro Leu Lys Gly Phe Ser Arg Ser Leu Asn Lys
1025 1030 1035 1040

Phe Leu Pro Ser Leu Lys Phe Ala Glu Ser Lys Asn Ser Tyr Arg Ser
1045 1050 1055

Glu Asn Phe Val Ser Asn Ile Asp Ser Asn Gly Ala Leu Leu Leu Leu
1060 1065 1070

Asp Val Phe Ile Arg Lys Phe Thr Asn Glu Lys Tyr Asn Leu Thr Gly
1075 1080 1085

Lys Glu Ala Val Pro Tyr Leu Glu Ala Lys Ala Ser Ser Leu Arg Ile
1090 1095 1100

Ala Ser Lys Phe Glu Glu Leu Leu Thr Glu Val Lys Gly Ile Pro Ala
1105 1110 1115 1120

Gly Glu Leu Ile Asn Met Ala Glu Val Ser Ser Asn Ile His Lys Ala
1125 1130 1135

Ile Ala Ser Gly Lys Pro Val Ser Lys Val Leu Cys Ser Tyr Leu Asp
1140 1145 1150

Thr Phe Ser Glu Leu Asn Ser Gln Gln Met Glu Glu Leu Val Asn Thr
1155 1160 1165

Tyr Leu Ser Thr Lys Pro Ser Val Ile Thr Ser Ala Ser Ala Asp Tyr
1170 1175 1180

Gln Lys Leu Pro Asn Leu Leu Thr Ala Thr Cys Leu Glu Pro Glu Arg
1185 1190 1195 1200

Met Ala Gln Leu Ile Asp Val His Gln Lys Met Phe Leu Arg
1205 1210
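
The correspondence between the nucleotide listing and the amino acid listing can be checked by translating codons with the standard genetic code. The following is a minimal sketch in Python, not part of the original specification: the function name translate and the table names are illustrative assumptions, and the input is the final coding line of the nucleotide listing (ATA GAT ... CGT) followed by its TAA stop codon.

```python
from itertools import product

# Standard genetic code, written in TCAG codon order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# One-letter to three-letter residue names, as used in the sequence listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna: str) -> list[str]:
    """Translate a DNA string codon by codon, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()  # drop the spaces between codons
    residues = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        residues.append(THREE_LETTER[aa])
    return residues


# Last ten codons of the coding region plus the stop codon, copied from the listing.
tail = "ATA GAT GTG CAT CAA AAG ATG TTT TTA CGT TAA"
print(" ".join(translate(tail)))
# prints: Ile Asp Val His Gln Lys Met Phe Leu Arg
```

The printed residues agree with the last ten residues (1205 to 1214) of the amino acid listing above.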
CLAIMS

1. A polypeptide, such as a toxin, formed by 
expression of a truncated form of a gene sequence, or an 
analogue thereof. 

2. A polypeptide as claimed in claim 1, in which the 
polypeptide is a neurotoxin. 

3. A polypeptide as claimed in any preceding claim, in 
which the polypeptide corresponds to a toxic derivative 
of a substantially non-toxic precursor polypeptide 
encoded by the gene sequence. 

4. A polypeptide as claimed in any preceding claim, in 
which the polypeptide comprises an amino acid sequence 
that corresponds to a truncated form of the amino acid 
sequence of a substantially non-toxic precursor 
polypeptide.

5. A polypeptide as claimed in claim 4, in which the 
amino acid sequence of the polypeptide corresponds to the 
amino acid sequence of the precursor polypeptide with 
truncation thereof principally at the carboxy (C) end. 

6. A polypeptide as claimed in claim 5, in which 
truncation is by about 150 to 200 amino acids. 
7. A polypeptide as claimed in any of claims 4 to 6,
in which the polypeptide amino acid sequence in addition
corresponds to the precursor polypeptide amino acid
sequence truncated at the amino end (N).

8. A polypeptide as claimed in claim 7, in which the 
truncation is by less than 50 amino acids, and desirably 
by 7 or 28 amino acids. 

9. A polypeptide as claimed in any preceding claim, in 
which the amino acid sequence of the polypeptide is 
homologous to the amino acid sequence of the insect 
specific neurotoxin δ-Latroinsectotoxin (δ-LIT) or an
active derivative thereof. 

10. A polypeptide as claimed in any preceding claim, in 
which the polypeptide comprises an amino acid sequence as 
shown in SEQ ID NO: 1 and SEQ ID NO: 2 or an active derivative
thereof.

11. A polypeptide as claimed in any preceding claim, in 
which the toxin is expressed from a nucleotide construct 
or truncated form of a gene sequence comprising a 
sequence as shown in SEQ ID NO: 1, or active variants
thereof. 

12. A polypeptide as claimed in any preceding claim, in 
which the polypeptide is expressed from a sequence
substantially as provided in a microorganism deposited at
The National Collections of Industrial and Marine
Bacteria Limited, under Accession No. NCIMB 40632.

13. A protein for use as a toxin comprising an amino 
acid sequence substantially as shown in SEQ ID NO: 1 and
SEQ ID NO: 2, or an active derivative thereof.

14. A nucleotide sequence comprising a truncated form 
of a gene sequence or an analogue thereof, for use in the 
expression of a polypeptide, such as a toxin. 

15. A nucleotide sequence as claimed in claim 14, in 
which the nucleotide sequence corresponds to a gene 
encoding for a precursor polypeptide and truncated at the 
3' end thereof, or an active derivative thereof. 

16. A nucleotide sequence as claimed in claim 15, in 
which the nucleotide sequence corresponds to the gene 
truncated by about 400 to 650 nucleotide bases, and 
desirably between 550 to 600 nucleotide bases. 



17. A nucleotide sequence as claimed in any of claims
14 to 16, in which the nucleotide sequence corresponds to
the gene truncated at the 5' end thereof.
18. A nucleotide sequence as claimed in claim 17, in
which the truncation is by less than 100 nucleotide
bases, and desirably by either 84 or 21 nucleotide bases.

19. A nucleotide sequence as claimed in any of claims 
14 to 18, in which the nucleotide sequence corresponds to 
part of a gene encoding for a neurotoxin in the venom of 
the Black Widow Spider (Latrodectus mactans
Tredecimguttatus), or an active derivative thereof.

20. A nucleotide sequence as claimed in claim 19, in 
which the nucleotide sequence corresponds to part of the 
gene encoding the precursor polypeptide of insect 
specific toxin δ-Latroinsectotoxin (δ-LIT), or an
active derivative thereof. 

21. A nucleotide sequence as claimed in any of claims 
14 to 20, in which the nucleotide sequence codes for a 
polypeptide comprising a sequence of 991 amino acids. 

22. A nucleotide sequence as claimed in any of claims 
14 to 21, in which the nucleotide sequence comprises a 
base sequence as shown in SEQ ID NO: 1, or an active
derivative thereof. 

23. A nucleotide sequence as claimed in any of claims 
14 to 22, in which the nucleotide sequence comprises a
base sequence substantially as comprised in a microorganism
deposited under Accession No. NCIMB 40632 at
The National Collections of Industrial and Marine 
Bacteria Limited. 

24. A nucleotide sequence as claimed in any of claims 
14 to 23, in which the nucleotide sequence codes for a 
polypeptide having an amino acid sequence as shown in 
SEQ ID NO: 1 and SEQ ID NO: 2, or an active derivative thereof.

25. A nucleotide sequence as claimed in any of claims 
14 to 24, in which the nucleotide sequence is a cDNA 
derived from mRNA by the use of an enzyme such as reverse 
transcriptase.

26. A nucleotide sequence as claimed in any of claims 
14 to 25, in which the nucleotide sequence is an 
oligonucleotide DNA construct produced perhaps using the 
polymerase chain reaction (PCR). 

27. A method of producing a polypeptide, the method 
comprising producing a recombinant DNA molecule 
comprising a truncated form of a gene, and expressing the 
truncated form in a host expression system, such as a 
viral or bacterial expression system, to produce the 
polypeptide.



28. A method as claimed in claim 27, in which the 
polypeptide produced is an active toxin substantially as
claimed in any preceding claim. 



29. A method as claimed in claim 27 or claim 28, in 
which the truncated form comprises part of a gene which 
encodes for a non-toxic precursor polypeptide. 

30. A method as claimed in any of claims 27 to 29, in 
which the truncated form comprises a nucleotide sequence 
substantially as claimed in any of claims 14 to 26. 

31. A method as claimed in any of claims 27 to 30, in 
which the expression system comprises E. coli BL21 (DE3)
bacterial cells transformed with pT7-7 vectors comprising 
the truncated form of the sequence. 

32. A method as claimed in any of claims 27 to 31, in 
which the expression system comprises a baculovirus 
system.



33. A recombinant DNA molecule comprising a truncated 
form of a gene encoding for a toxin generally as claimed 
in any preceding claim. 

34. A recombinant DNA molecule as claimed in claim 33, 
in which the molecule comprises a virus. 



35. A recombinant DNA molecule as claimed in claim 34,
in which the molecule comprises a baculovirus.

36. A recombinant DNA molecule substantially as 
provided in the microorganism deposited under Accession 
No. NCIMB 40632.

37. An expression vector comprising a truncated form of 
a gene generally as claimed in any of claims 14 to 26. 

38. A cell, such as a viral or bacterial cell 
transformed with a recombinant molecule substantially as 
claimed in any of claims 33 to 37. 

39. An insecticide comprising a toxin substantially as 
claimed in any of claims 1 to 13. 

40. An insecticide as claimed in claim 39, in which the 
insecticide is so as to be administered orally or 
topically.

41. An insecticide as claimed in claim 39 or claim 40, 
in which the insecticide comprises a spray. 

42. An insecticide system comprising means for 
expressing a truncated form of a gene to produce a toxin 
substantially as claimed in any preceding claim in an 
insect to kill or incapacitate the insect. 
43. An insecticide system as claimed in claim 42, in
which the insecticide system comprises a viral expression
system.

44. An insecticide system as claimed in claim 43, in 
which the viral expression system comprises a baculovirus 
expression system. 

45. A plant comprising a genetically modified cell 
containing a truncated form of a gene sequence 
substantially as claimed in any of claims 14 to 26. 

46. A non-human animal comprising a genetically modified
cell containing a truncated form of a gene sequence
substantially as claimed in any of claims 14 to 26. 

47. A toxin formed by processing of a substantially 
isolated non-toxic precursor polypeptide. 

48. A toxin as claimed in claim 47, in which the toxin 
is formed by truncation toward the carboxy (C) end of the 
precursor polypeptide. 

49. A toxin as claimed in claim 48, in which the toxin 
amino acid sequence generally corresponds to the amino 
acid sequence of the precursor polypeptide, truncated by 
between 150 and 200 amino acids. 
50. A toxin as claimed in any of claims 47 to 49, in
which the toxin amino acid sequence is formed by
truncation toward the amino (N) end of the precursor
polypeptide amino acid sequence.



51. A toxin as claimed in claim 50, in which the
fragment cleaved from the amino end is significantly 
smaller than the fragment cleaved from the carboxy end. 



52. A toxin as claimed in claim 50 or claim 51, in 
which the fragment cleaved off comprises 7 or 28 amino 
acids.



53. A toxin as claimed in any of claims 47 to 52, in 
which the toxin has an amino acid sequence corresponding 
to a polypeptide encoded by part of a gene of the Black 
Widow Spider (Latrodectus mactans Tredecimguttatus) . 

54. A toxin as claimed in any of claims 47 to 53, in 
which the toxin comprises or is an analogue of the insect 
specific neurotoxin δ-Latroinsectotoxin (δ-LIT), or an
active derivative thereof. 



55. A toxin as claimed in any of claims 47 to 54, in 
which the toxin comprises an amino acid sequence as shown 
in SEQ ID NO: 1 and SEQ ID NO: 2 or an active derivative thereof.



56. A method of producing an active polypeptide from an
isolated inactive precursor polypeptide, the method
comprising truncating the isolated precursor polypeptide.

57. A method as claimed in claim 56, in which the 
isolated precursor polypeptide is truncated at the 
carboxyl end.



58. A method as claimed in claim 56 or claim 57, in
which the truncation is effected using proteolytic
cleavage, and preferably by site directed mutagenesis.



59. A method as claimed in any of claims 56 to 58, in 
which truncation of the N terminus may be provided. 



60. A method as claimed in claims 56 to 59, in which
the active polypeptide is a toxin and is substantially as
claimed in any of claims 1 to 13, 47 to 55.



61. An isolated nucleotide base sequence encoding for a
toxin precursor polypeptide with an amino acid sequence
as shown in SEQ ID NO: 4 or a derivative thereof.



62. An isolated base sequence comprising a base 
sequence as shown in SEQ ID NO: 3 or a derivative thereof.



63. An isolated base sequence as claimed in any of 
claims 61 or 62, in which the nucleotide base sequence 
encodes a precursor polypeptide of the neurotoxin 
δ-Latroinsectotoxin (δ-LIT).

64. An isolated base sequence substantially as provided
in the microorganism deposited under Accession No. NCIMB
40633.



65. A recombinant DNA molecule comprising a sequence 
substantially as claimed in any of claims 61 to 64. 

66. A recombinant molecule as claimed in claim 65, in 
which the molecule comprises a virus. 

67. A recombinant molecule as claimed in claim 66, in 
which the virus comprises a baculovirus. 

68. A cell, such as a bacterial or viral cell, 
transformed with a recombinant DNA molecule substantially
as claimed in any of claims 65 to 67. 



69. An insecticide system comprising means for 
expressing a base sequence substantially as claimed in 
any of claims 61 to 64 to produce a precursor polypeptide 
and to process the precursor polypeptide to produce a 
toxin in an insect to kill or incapacitate the insect. 

70. An insecticide system as claimed in claim 69, in
which the system comprises a viral expression system.

71. An insecticide system as claimed in claim 69, in
which the viral expression system comprises baculovirus.



72. A plant comprising a genetically modified cell 
containing a nucleotide sequence substantially as claimed 
in any of claims 61 to 64. 

73. A non-human animal comprising a genetically 
modified cell containing a nucleotide sequence 
substantially as claimed in any of claims 61 to 64.

74. A novel toxin substantially as hereinbefore 
described with reference to SEQ ID NO: 1 and SEQ ID NO: 2.

75. A nucleotide sequence substantially as hereinbefore 
described with reference to SEQ ID NO: 1.

76. An isolated polypeptide substantially as 
hereinbefore described with reference to SEQ ID NO: 3 and
SEQ ID NO: 4.

77. An isolated nucleotide sequence substantially as 
hereinbefore described with reference to SEQ ID NO: 3.

78. Any novel subject matter or combination including 
novel subject matter disclosed, whether or not within the 
scope of or relating to the same invention as any of the 
preceding claims.
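
The claims above state truncations both in amino acids (about 150 to 200, and 7 or 28) and in nucleotide bases (550 to 600, and 21 or 84). Since three bases encode one residue, the two sets of figures can be related by simple arithmetic. The following Python sketch is illustrative only and not part of the original claims: the function names are hypothetical, and the final calculation assumes, purely for illustration, that the 991-residue polypeptide of claim 21 derives from the 1214-residue precursor of SEQ ID NO: 4 with 28 residues removed from the amino end, which the claims do not state explicitly.

```python
CODON_LENGTH = 3  # three nucleotide bases encode one amino acid residue


def bases_to_residues(n_bases: int) -> int:
    """Convert a number of coding nucleotide bases to amino acid residues."""
    if n_bases % CODON_LENGTH:
        raise ValueError("base count is not a whole number of codons")
    return n_bases // CODON_LENGTH


def residues_to_bases(n_residues: int) -> int:
    """Convert a number of amino acid residues to coding nucleotide bases."""
    return n_residues * CODON_LENGTH


def c_terminal_truncation(precursor_len: int, product_len: int,
                          n_terminal_removed: int) -> int:
    """Residues removed from the carboxy end, given the total lengths and the
    number of residues removed from the amino end (an assumed input)."""
    removed = precursor_len - product_len - n_terminal_removed
    if removed < 0:
        raise ValueError("product is longer than the truncated precursor")
    return removed


# 5' truncations of claim 18 expressed as residues (claim 8: 28 or 7).
print(bases_to_residues(84), bases_to_residues(21))  # prints: 28 7

# Illustrative assumption: 1214-residue precursor, 991-residue product,
# 28 residues removed at the amino end.
c_removed = c_terminal_truncation(1214, 991, 28)
print(c_removed, residues_to_bases(c_removed))  # prints: 195 585
```

Under that assumption the carboxy-end truncation is 195 residues, or 585 bases, which falls within the ranges of about 150 to 200 amino acids and 550 to 600 nucleotide bases recited in claims 6 and 16.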
[received by the International Bureau on 14 September 1995 (14.09.95);
original claims 1-78 replaced by amended claims 1-74 (9 pages)]

1. An active insect specific neurotoxin polypeptide, formed by expression of a
truncated form of a gene sequence corresponding to part of a gene encoding for an insect
specific neurotoxin present in the venom of the Black Widow Spider (Latrodectus mactans
Tredecimguttatus), or an analogue thereof.

2. A polypeptide as claimed in any preceding claim, in which the polypeptide 
corresponds to a toxic derivative of a substantially non-toxic precursor polypeptide encoded
by the gene sequence. 



3. A polypeptide as claimed in any preceding claim, in which the polypeptide
comprises an amino acid sequence that corresponds to a truncated form of the amino acid
sequence of a substantially non-toxic precursor polypeptide.

4. A polypeptide as claimed in claim 3, in which the amino acid sequence of the 
polypeptide corresponds to the amino acid sequence of the precursor polypeptide with 
truncation thereof principally at the carboxy (C) end. 

5. A polypeptide as claimed in claim 4, in which truncation is by about 150 to
200 amino acids. 



6. A polypeptide as claimed in any of claims 3 to 5, in which the polypeptide amino
acid sequence in addition corresponds to the precursor polypeptide amino acid sequence
truncated at the amino end (N).

7. A polypeptide as claimed in claim 6, in which the truncation is by less than
50 amino acids and desirably by 7 or 28 amino acids. 



8. A polypeptide as claimed in any preceding claim, in which the amino acid sequence 
of the polypeptide is homologous to the amino acid sequence of the insect specific 
neurotoxin δ-Latroinsectotoxin (δ-LIT) or an active derivative thereof.
9. A polypeptide as claimed in any preceding claim, in which the polypeptide
comprises an amino acid sequence as shown in SEQ ID NO: 1 and SEQ ID NO: 2 or an active
derivative thereof.



10. A polypeptide as claimed in any preceding claim, in which the toxin is expressed 
from a nucleotide construct or truncated form of a gene sequence comprising a sequence 
as shown in SEQ ID NO: 1, or active variants thereof.



11. A polypeptide as claimed in any preceding claim, in which the polypeptide is
expressed from a sequence substantially as provided in a microorganism deposited at
The National Collections of Industrial and Marine Bacteria Limited, under Accession
No. NCIMB 40632.



12. A protein for use as a toxin comprising an amino acid sequence substantially as 
shown in SEQ ID NO: 1 and SEQ ID NO: 2, or an active derivative thereof.

13. A nucleotide sequence comprising a truncated form of a gene sequence
corresponding to part of a gene encoding for an insect specific neurotoxin present in the
venom of the Black Widow Spider (Latrodectus mactans Tredecimguttatus) or an analogue
thereof, for use in the expression of an active insect specific neurotoxin polypeptide.

14. A nucleotide sequence as claimed in claim 13, in which the nucleotide sequence 
corresponds to a gene encoding for a precursor polypeptide and truncated at the 3' end 
thereof, or an active derivative thereof. 

15. A nucleotide sequence as claimed in claim 14, in which the nucleotide sequence
corresponds to the gene truncated by about 400 to 650 nucleotide bases, and desirably 
between 550 to 600 nucleotide bases. 
16. A nucleotide sequence as claimed in any of claims 13 to 15, in which the nucleotide
sequence corresponds to the gene truncated at the 5' thereof. 

17. A nucleotide sequence as claimed in claim 16, in which the truncation is by less 
than 100 nucleotide bases, and desirably by either 84 or 21 nucleotide bases. 

18. A nucleotide sequence as claimed in any one of claims 13 to 17, in which the
nucleotide sequence corresponds to part of the gene encoding the precursor polypeptide of
insect specific toxin δ-Latroinsectotoxin (δ-LIT), or an active derivative thereof.

19. A nucleotide sequence as claimed in any of claims 13 to 18, in which the nucleotide
sequence codes for a polypeptide comprising a sequence of 991 amino acids. 

20. A nucleotide sequence as claimed in any of claims 13 to 19, in which the nucleotide
sequence comprises a base sequence as shown in SEQ ID NO: 1, or an active derivative
thereof.

21. A nucleotide sequence as claimed in any of claims 13 to 20, in which the nucleotide
sequence comprises a base sequence substantially as comprised in a microorganism
deposited under Accession No. NCIMB 40632 at The National Collections of Industrial and
Marine Bacteria Limited.

22. A nucleotide sequence as claimed in any of claims 13 to 21, in which the nucleotide
sequence codes for a polypeptide having an amino acid sequence as shown in SEQ ID NO: 1
and SEQ ID NO: 2, or an active derivative thereof.

23. A nucleotide sequence as claimed in any of claims 13 to 22, in which the nucleotide
sequence is a cDNA derived from mRNA by the use of an enzyme such as reverse 
transcriptase. 
24. A nucleotide sequence as claimed in any of claims 13 to 23, in which the nucleotide
sequence is an oligonucleotide DNA construct produced perhaps using the polymerase
chain reaction (PCR).



25. A method of producing a polypeptide as claimed in claim 1, the method comprising 
producing a recombinant DNA molecule comprising a truncated form of a gene as claimed 
in claim 13, and expressing the truncated form in a host expression system, such as a 
bacterial expression system, to produce the polypeptide. 

26. A method as claimed in claim 25, in which the polypeptide produced is an active 
toxin substantially as claimed in any preceding claim. 

27. A method as claimed in claim 25 or claim 26, in which the truncated form
comprises part of a gene which encodes for a non-toxic precursor polypeptide. 

28. A method as claimed in any of claims 25 to 27, in which the truncated form 
comprises a nucleotide sequence substantially as claimed in any of claims 14 to 24. 

29. A method as claimed in any of claims 25 to 28, in which the expression system
comprises E. coli BL21 (DE3) bacterial cells transformed with pT7-7 vectors comprising
the truncated form of the sequence.

30. A method as claimed in any of claims 25 to 29, in which the expression system 
comprises a baculovirus system. 

31. A recombinant DNA molecule comprising a truncated form of a gene encoding for
a toxin generally as claimed in any preceding claim.

32. A recombinant DNA molecule as claimed in claim 31, in which the molecule 
comprises a virus. 
33. A recombinant DNA molecule as claimed in claim 32, in which the molecule
comprises a baculovirus. 

34. A recombinant DNA molecule substantially as provided in the microorganism 
deposited under Accession No. NCIMB 40632. 

35. An expression vector comprising a truncated form of a gene generally as claimed
in any of claims 14 to 24. 

36. A cell, such as a viral or bacterial cell transformed with a recombinant molecule 
substantially as claimed in any of claims 31 or 35. 

37. An insecticide comprising a toxin substantially as claimed in any of claims 1 to 12. 

38. An insecticide as claimed in claim 37, in which the insecticide is so as to be
administered orally or topically. 

39. An insecticide as claimed in claim 37 or claim 38, in which the insecticide 
comprises a spray. 

40. An insecticide system comprising means for expressing a truncated form of a gene
as claimed in any one of claims 13 to 24 to produce a toxin substantially as claimed in any
preceding claim in an insect to kill or incapacitate the insect.

41. An insecticide system as claimed in claim 40, in which the insecticide system 
comprises a viral expression system. 

42. An insecticide system as claimed in claim 41, in which the viral expression system
comprises a baculovirus expression system.

43. A plant comprising a genetically modified cell containing a truncated form of a gene
sequence substantially as claimed in any of claims 13 to 24.

44. A non-human animal comprising a genetically modified cell containing a truncated 
form of a gene sequence substantially as claimed in any of claims 13 to 24. 

45. A toxin formed by processing of a substantially isolated non-toxic precursor
polypeptide having an amino acid sequence corresponding to that of an insect specific 
neurotoxin present in the venom of the Black Widow Spider (Latrodectus mactans 
Tredecimguttatus). 

46. A toxin as claimed in claim 45, in which the toxin is formed by truncation toward
the carboxy (C) end of the precursor polypeptide. 

47. A toxin as claimed in claim 46, in which the toxin amino acid sequence generally
corresponds to the amino acid sequence of the precursor polypeptide, truncated by between
150 and 200 amino acids.



48. A toxin as claimed in any of claims 45 to 47, in which the toxin amino acid 
sequence is formed by truncation toward the amino (N) end of the precursor polypeptide 
amino acid sequence. 



49. A toxin as claimed in claim 48, in which the fragment cleaved from the amino end 
is significantly smaller than the fragment cleaved from the carboxy end. 

50. A toxin as claimed in claim 48 or claim 49, in which the fragment cleaved off
comprises 7 or 28 amino acids. 



51. A toxin as claimed in any of claims 45 to 50, in which the toxin comprises or is an
analogue of the insect specific neurotoxin δ-Latroinsectotoxin (δ-LIT), or an active
derivative thereof.

52. A toxin as claimed in any of claims 45 to 51, in which the toxin comprises an amino
acid sequence as shown in SEQ ID NO 1 and SEQ ID NO 2 or an active derivative thereof.

53. A method of producing an active insect specific neurotoxin polypeptide from an
isolated inactive precursor polypeptide having an amino acid sequence corresponding to
that of an insect specific neurotoxin present in the venom of the Black Widow Spider
(Latrodectus mactans Tredecimguttatus), the method comprising truncating the isolated
precursor polypeptide.

54. A method as claimed in claim 53, in which the isolated precursor polypeptide is 
truncated at the carboxyl end.

55. A method as claimed in claim 53 or claim 54, in which the truncation is effected
using proteolytic cleavage, and preferably by site directed mutagenesis. 

56. A method as claimed in any of claims 53 to 55, in which truncation of the 
N terminus is provided. 

57. A method as claimed in claims 53 to 56, in which the active polypeptide is a toxin
and is substantially as claimed in any of claims 1 to 12, 45 to 52.

58. An isolated nucleotide base sequence encoding for a toxin precursor polypeptide 
with an amino acid sequence as shown in SEQ ID NO 4 or a derivative thereof.

59. An isolated base sequence comprising a base sequence as shown in SEQ ID NO 3 or
a derivative thereof.

60. An isolated base sequence as claimed in any of claims 58 or 59, in which the
nucleotide base sequence encodes a precursor polypeptide of the neurotoxin
δ-Latroinsectotoxin (δ-LIT).



AMENDED SHEET (ARTICLE 19) 



WO 95/29235 PCT/GB95/00917


61. An isolated base sequence substantially as provided in the microorganism deposited 
under Accession No. NCIMB 40633. 



62. A recombinant DNA molecule comprising a sequence substantially as claimed in
any of claims 58 to 61. 

63. A recombinant molecule as claimed in claim 62, in which the molecule comprises
a virus. 

64. A recombinant molecule as claimed in claim 63, in which the virus comprises a 
baculovirus.

65. A cell, such as a bacterial or viral cell, transformed with a recombinant DNA 
molecule substantially as claimed in any of claims 62 to 64.

66. An insecticide system comprising means for expressing a base sequence 
substantially as claimed in any of claims 58 to 61 to produce a precursor polypeptide and 
to process the precursor polypeptide to produce a toxin in an insect to kill or incapacitate 
the insect. 

67. An insecticide system as claimed in claim 66, in which the system comprises a viral
expression system. 

68. An insecticide system as claimed in claim 67, in which the viral expression system 
comprises baculovirus. 

69. A plant comprising a genetically modified cell containing a nucleotide sequence 
substantially as claimed in any of claims 58 to 61.

70. A non-human animal comprising a genetically modified cell containing a nucleotide 
sequence substantially as claimed in any of claims 58 to 61.

71. A novel toxin substantially as hereinbefore described with reference to SEQ ID NO 1
and SEQ ID NO 2.



72. A nucleotide sequence substantially as hereinbefore described with reference to 
SEQ ID NO 1.

73. An isolated polypeptide substantially as hereinbefore described with reference to
SEQ ID NO 3 and SEQ ID NO 4.

74. An isolated nucleotide sequence substantially as hereinbefore described with
reference to SEQ ID NO 3.